Multimedia:Language selector

From Wikimedia Usability Initiative

We need a usable way to select languages on Wikimedia projects.

Current system

The current system provides a menu listing many languages, each displayed under its own name in that language where available, but ordered by ISO 639 code.

Good points:

  • Sorting by ISO code naturally groups language variants together, such as Chinese Traditional (zh-tw) and Chinese Simplified (zh-cn).
  • Speakers of a language see its name in the most common form.
  • Each language is treated equally.

Bad points:

  • The ordering is very unintuitive, since the ISO codes aren't even shown and sometimes bear no relationship to what the user is searching for. Users may assume that they should be looking for the name in English, or in the interface language. In some informal usability testing, we've seen users confused about why "Fijian Hindi" is listed in English but Hindi is listed in Devanagari script.
    • (Response: All names are listed in the interface language currently, including Fiji Hindi. The reason it appears in Latin alphabet is that the Fiji Hindi Wikipedia uses almost exclusively Latin alphabet.) --Node ue 22:13, 7 June 2010 (UTC)
  • The very large number of options is time-consuming to scroll through, even if you know what you're looking for.
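
To illustrate the ordering problem described above, here is a minimal sketch, assuming a small hand-picked sample of codes and menu names (the `MENU_NAMES` table and `menu_order` function are hypothetical, not taken from MediaWiki). It sorts by code, as the current menu does, and returns the names the user would actually see.

```python
# Hypothetical sample: ISO 639 code -> name as displayed in the menu.
MENU_NAMES = {
    "zh": "中文",
    "hi": "हिन्दी",
    "fi": "suomi",
    "de": "Deutsch",
    "hif": "Fiji Hindi",
    "es": "español",
    "en": "English",
}


def menu_order(names):
    """Return displayed names in the current menu order: sorted by code."""
    return [names[code] for code in sorted(names)]


if __name__ == "__main__":
    for shown in menu_order(MENU_NAMES):
        print(shown)
```

In this sample "suomi" lands between "español" and "हिन्दी" because its code is "fi", which a user who cannot see the codes has no way to predict.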

Requirements?

  • Treat every language fairly.
  • (Unclear) do not privilege languages just for being popular. (?) There may be ways to speed up the interface for common languages without impeding uncommon languages.
  • Include every reasonable language and subgroup (although, Wikimedia has significantly departed from ISO standards...)
    • As far as I am aware, we have only departed from ISO standards in a handful of very limited cases which are now "grandfathered in". Can you please elaborate? --Node ue 22:15, 7 June 2010 (UTC)
    • FYI: We currently have roughly 20 out of 300+ cases of abbreviations for languages which, or the uses of which, are not 100% conforming to the standard BCP47. About a dozen of those are planned to be changed to standard ones sooner or later. Some of those are only used at specific places and do not appear elsewhere. --Purodha 01:40, 27 November 2010 (UTC)
  • Support bilingual users where necessary. It should be easy to switch between your preferred languages.

Use cases

  • Input -- Wikimedia Commons is inherently multilingual. All media items have a description field that may be split into multiple languages. The only time a "switch" between languages is necessary is when entering such descriptions.
  • Linking -- this might not be relevant to the question of how to input data? The Wikipedias each cover just one language. However, there are cross-language-wiki links maintained between articles that are deemed to be about the same thing. In the past this has just been a list of links in the left-hand column, ordered in the traditional (bad) ISO 639 code manner.

Approaches taken by other sites

  • Show every language offered, in its own script and language. Many sites offer a drop-down list like this. However it is difficult to scale. Facebook takes it the furthest by showing every language they offer in a large grid.
Facebook (circa mid 2010) offers its site in about 80 languages, some of which are in beta. Its interface selection widget pops up a matrix of choices, each showing the name of the language in that language. This has the advantage that users can scan the entire list all at once. Multiple ordering approaches seem to be used, with most European languages sorted lexicographically, the Chinese variants grouped together, Japanese off by itself, and the others consigned to a column of miscellaneous languages.
A better explanation of that: Korean, Chinese, Thai and Japanese are sorted according to English alphabetical order by their native name (Hangukmal, Zhongwen, Thai and Nihongo respectively) --Node ue 22:16, 7 June 2010 (UTC)
  • Show every language offered, in the current user interface language AND in its own script and language. Google does this when setting language preferences in certain places. It's probably the most usable.
Most sites comparable to Wikimedia Commons offer translations in only ten or perhaps twenty languages. At the time of writing, Wikimedia Commons has 356, so this common solution is not going to work for us. While we could probably find the English name for Spanish and vice versa, discovering the Quechua name for Japanese may be flat-out impossible.
Unilang has a drop-down menu with the English name of each language, followed by the name of the language in that language. It is sorted by the English name of the major language family, with dialects and subgroups sorted beneath that regardless of their English names. Although the rest of the site has a localizable interface, this menu is always displayed in English, wisely avoiding the n**2 problem of translating every language name into every language. However, we don't have the option of privileging English.
This is probably not an acceptable solution for us. This gives English a favored status, which may be acceptable for Unilang as an organization but is not for us. Accusations of attempted imposition of Anglocentric cultural hegemony will arise very quickly if people on the French Wikipedia see English names for all languages. --Node ue 22:19, 7 June 2010 (UTC)
Node ue, what about "in the current user interface language" seems anglocentric to you? NeilK 00:18, 8 June 2010 (UTC)
Node ue: never mind, I see you're reacting to the Unilang example. That was just the closest thing I could find to what I'm describing. I agree it's not acceptable for us. NeilK 00:22, 8 June 2010 (UTC)
The n**2 problem is basically being solved by the Unicode CLDR [1], and http://translatewiki.net/ is using a language picker inside MediaWiki that uses its data. A drawback may be that the CLDR data is not very complete yet, and it currently has a very slow update cycle, between half a year and three years. I believe that upgrading the language picker to show additional information, such as the language code and/or the language names in the interface language, would be pretty straightforward. Tell me if you want me to investigate that further, or make it. --Purodha 01:59, 27 November 2010 (UTC)
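
The "interface language AND own script" approach, combined with incomplete translation data, could be sketched roughly as follows. This is only an illustration under stated assumptions: the `NATIVE_NAMES` and `TRANSLATED_NAMES` tables are hypothetical sample data (a real picker might draw on CLDR instead), and `label` is an invented helper, not an existing MediaWiki function.

```python
# Hypothetical sample data: each language's name written in that language.
NATIVE_NAMES = {"ja": "日本語", "es": "español", "qu": "Runa Simi"}

# Hypothetical sample data: TRANSLATED_NAMES[ui_lang][lang] is the name
# of `lang` written in `ui_lang`. Gaps are expected (the n**2 problem).
TRANSLATED_NAMES = {
    "en": {"ja": "Japanese", "es": "Spanish", "qu": "Quechua"},
    "es": {"ja": "japonés", "es": "español"},
    "qu": {},
}


def label(lang, ui_lang):
    """Build a menu label: translated name plus native name when both exist.

    Falls back to the native name alone when no translation is known,
    and to the bare code when even the native name is missing.
    """
    native = NATIVE_NAMES.get(lang, lang)
    translated = TRANSLATED_NAMES.get(ui_lang, {}).get(lang)
    if translated is None or translated == native:
        return native
    return f"{translated} ({native})"


if __name__ == "__main__":
    print(label("ja", "en"))  # translated and native name
    print(label("ja", "qu"))  # no Quechua name known: native name only
```

With this fallback, no interface language is privileged: a missing translation degrades to the native name rather than to English.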

Other resources

  • ImageXMedia has a good resource on approaches taken by various sites, what to do, and what not to do.

Ideas

Predictive text input

Guillaume proposed a predictive typing interface that matches on all the strings we know of, plus the ISO 639 code. So one could find the Chinese language by typing "ch" to find its English name, "zh" for its language code, or by typing 中文. Facebook and Flickr use similar interfaces to quickly search for contacts, and this approach has been proven to scale to hundreds of entries.

This is a clever solution since it allows us to make the best use of incomplete information. If we don't have the Swahili name for Japanese, we simply don't include it as a possible input. If we *do* happen to have it, great.
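
The matching idea above can be sketched as a simple prefix search over every string we know for a language, plus its code. This is a minimal sketch, not Guillaume's actual design: the `KNOWN_NAMES` table is hypothetical sample data, and `normalize` and `suggest` are invented names.

```python
import unicodedata

# Hypothetical sample data: code -> every name we happen to know.
# Missing translations are simply absent, which is fine.
KNOWN_NAMES = {
    "zh": ["中文", "Chinese", "chinois"],
    "ja": ["日本語", "Japanese", "japonais"],
    "cs": ["čeština", "Czech"],
    "sw": ["Kiswahili", "Swahili"],
}


def normalize(text):
    """Fold case and Unicode compatibility forms so comparisons are lenient."""
    return unicodedata.normalize("NFKC", text).casefold()


def suggest(typed, names=KNOWN_NAMES):
    """Return codes of languages whose code or any known name starts with `typed`."""
    prefix = normalize(typed)
    if not prefix:
        return []
    matches = []
    for code, labels in names.items():
        candidates = [code] + labels
        if any(normalize(c).startswith(prefix) for c in candidates):
            matches.append(code)
    return sorted(matches)


if __name__ == "__main__":
    for typed in ("ch", "zh", "中文"):
        print(typed, "->", suggest(typed))
```

In this sample, "ch", "zh" and "中文" all lead to the code "zh". A production widget would also need ranking and probably substring matching, which this sketch leaves out.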

This is a largish subproject with uses outside Multimedia usability (or even Wikimedia generally) which may need its own scoping, probably its own extension.

However, we don't have to code this from scratch. There exist some standard widgets for predictive search, or remote AJAX search from text entry.

Geographically-oriented input

The idea here is to offer users a way to navigate through languages in a value-neutral manner.

1) We would divide up the world into hierarchical regions, and obtain translated names for that short list of regions in as many languages as possible.
2) Then we offer a hierarchical menu, starting with the continents, that shows the user the languages common in each region.

Here's an example of looking for Spanish while browsing with an English interface.

  • South America
    • Portuguese (Brazilian)
    • Spanish

OR

  • Europe
    • Spain
      • Catalan
      • Spanish

While this is possibly a convenient way to pick a language, another solution must be devised to show the user's choices (particularly if they make multiple selections, as when setting preferences).
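
One possible shape for the region hierarchy is a nested mapping in which a language may appear under several regions. The sketch below is an assumption-laden illustration: the `REGIONS` tree contains only the two branches from the example above, and `paths_to` is a hypothetical helper.

```python
# Hypothetical, deliberately tiny region tree mirroring the example above.
# Region names would themselves be translated into the interface language.
REGIONS = {
    "South America": {"languages": ["pt-br", "es"], "subregions": {}},
    "Europe": {
        "languages": [],
        "subregions": {
            "Spain": {"languages": ["ca", "es"], "subregions": {}},
        },
    },
}


def paths_to(lang, tree=REGIONS, trail=()):
    """Return every menu path (a tuple of region names) that leads to `lang`."""
    found = []
    for name, node in tree.items():
        here = trail + (name,)
        if lang in node["languages"]:
            found.append(here)
        # Recurse into nested regions, carrying the path walked so far.
        found.extend(paths_to(lang, node["subregions"], here))
    return found


if __name__ == "__main__":
    for path in paths_to("es"):
        print(" > ".join(path))
```

Here Spanish is reachable both through South America and through Europe > Spain, matching the two routes shown in the example.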

Problems with many of the proposals on this page

It seems that many of these proposals presuppose that all users looking for a language have some degree of proficiency in English, which is not necessarily the case. If I only speak Quechua, how am I going to know to look under English-language labels like "South America" to find it? For a reverse example... if you're looking for English, would you know to choose between the labels "ევროპის", "აფრიკა", "აზია", "ჩრდილოეთი ამერიკა" and "სამხრეთ ამერიკა" (the names of the continents in Georgian) to find your language? Any solution must take into account the fact that there are many, many people on this planet who are monolingual in languages other than English and cannot understand even one single short word of the language. --Node ue 22:24, 7 June 2010 (UTC)

We are trying to craft proposals which aren't English-specific. I see your point, which is actually orthogonal to geography or English -- you're saying that the current interface language may not be a language that the user speaks, so they can't navigate a menu, of any kind, unless they see some string of characters they understand. If we can't put every language on the screen, they have to be able to guess that there is such a widget, and intuit how to use it without words. But, if we assume we have to have a language-interface-switcher on every page, the only solution is to have every language on every page, in some way, as a menu, popup, predictive text input, or paragraph of 356 links. Is there an elegant solution to this? An icon with little word balloons coming off of a globe, that pops up some sort of widget? NeilK 00:02, 8 June 2010 (UTC)
Well, my example was merely written from the point of view of a non-English speaker who visits a page on the English Wikipedia, but it could be applied to anybody at a site in a language they don't understand. As far as an elegant solution, any solution has to be very, very easy to figure out, which icons might not necessarily be. Another possible downside to anything that is hidden is that it takes an extra click to find out if a version in your language is available... say I prefer Swahili, I'm at the page w:Baby boomer... I want to view the article in my language, but it's not there. With a language like Swahili, in which it's relatively uncommon to find Interwiki links from en.wp (sw.wp has 18K articles compared to millions on en.wp), users would probably not even bother. Having the language name there, announcing itself, is like stumbling upon buried treasure for many people.
As far as having 356 links, that could be a problem in the future, but it isn't right now. See [2] - at the English Wikipedia, only about 3% of articles have more than 13 interwikis, and only a couple thousand articles have more than 50. I agree that a better solution should be found than having 250+ links, but we have to try to do that without sacrificing ease-of-use. --Node ue 00:12, 11 June 2010 (UTC)