Template:BCP47

From translatewiki.net
The documentation of this template is included from its /doc subpage.
Usage:
Normalizes a language tag (letter case is not significant) to standard BCP 47. Some accepted input tags are legacy codes still used in Wikimedia for interwikis and some translations. The returned tag uses only lowercase letters. This helps with categorizing pages, getting translations for language names, and generating conforming lang="" attributes in HTML.
Note that 3-letter ISO 639-3 language codes that are equivalent to existing 2-letter ISO 639-1 codes are replaced by the latter, shorter codes, according to the BCP 47 mapping (the list is complete and stable, as ISO 639-1 is frozen). Likewise, distinct legacy ISO 639-2/B codes (which were defined for broader bibliographic classification in some well-known public libraries, not for the more precise terminological purposes of actual translations) are mapped to the equivalent ISO 639-1 code if one exists, or to the ISO 639-2/T code otherwise. Some other legacy codes with extensions are remapped to shorter codes. The obsolete codes iw and jw are also remapped to he and jv respectively.
Note also that source codes are not all tested for complete validity (testing all combinations of about 7000 codes in ISO 639-3 alone, plus hundreds of script or regional variants, would be prohibitive). Invalid codes are therefore returned as is, only minimally normalized to lowercase. However, the mapping of legacy/deprecated 2-letter codes from ISO 639-1 or ISO 639-2 into newer standard codes is complete (including for prefixes used in variants), as is the mapping of non-standard legacy codes still used in Wikimedia public wikis (these cases are well known and documented, with a clear but long transition to standard codes either in progress or finished).
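The behaviour described in the two notes above can be sketched outside the wiki as follows. This is only an illustration in Python, not the template's actual wikitext: the function name normalize_tag and the table LEGACY_PRIMARY are hypothetical, and the table holds a small sample of mappings rather than the template's full list.

```python
# Hypothetical, partial mapping table (the template's real list is longer).
LEGACY_PRIMARY = {
    "iw": "he",   # obsolete code for Hebrew
    "jw": "jv",   # obsolete code for Javanese
    "eng": "en",  # ISO 639-3 code with an ISO 639-1 equivalent
    "fra": "fr",  # ISO 639-3 / ISO 639-2/T code with an ISO 639-1 equivalent
    "fre": "fr",  # ISO 639-2/B bibliographic code
    "deu": "de",
    "ger": "de",  # ISO 639-2/B bibliographic code
}


def normalize_tag(tag: str) -> str:
    """Lowercase the tag and remap a known legacy primary subtag.

    Unknown or invalid tags are not validated: they are returned as is,
    only lowercased.
    """
    subtags = tag.strip().lower().split("-")
    subtags[0] = LEGACY_PRIMARY.get(subtags[0], subtags[0])
    return "-".join(subtags)


if __name__ == "__main__":
    for sample in ("IW", "ENG-US", "Xx-Foo"):
        print(sample, "->", normalize_tag(sample))
```

Only the primary (first) subtag is looked up here; the remaining subtags are passed through in lowercase, which mirrors the note that unknown codes are not rejected.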
Likewise, suffixed subtags are not fully tested (not even region subtags from ISO 3166-1 or UN M.49, or script subtags from ISO 15924). Only those used in known projects are mapped; this list, at the start of the template, may be extended incrementally as needed, when subtags are found and documented in the IANA registry of language subtags for BCP 47. Note that the legacy (but incorrect) suffix subtags -ec and -el, still used in Wikimedia public wikis instead of the correct -cyrl and -latn script subtags from ISO 15924, are supported for a few Serbo-Croatian languages. In strict theory this creates a conflict with BCP 47, because -ec and -el are normally permanently reserved for ISO 3166-1 region codes. However, these Serbo-Croatian languages have no standardized form in the countries those codes designate. In most cases, regional variants using region codes are also being deprecated in favor of either separate codification in ISO 639-3 as new language codes, or registration in the IANA registry as language-dependent dialectal/orthographic variant subtags (with at least 5 letters, or at least 4 digits for years, to avoid conflicts with other primary or suffixed subtags).
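The special handling of -ec and -el can be sketched in Python as below. This is a hedged illustration rather than the template's implementation: the function name fix_script_suffix and both tables are hypothetical, and the set of affected languages (sr and sh here) is an assumption, since the text above only says "a few Serbo-Croatian languages".

```python
# Legacy Wikimedia suffixes and the ISO 15924 script subtags they stand for.
LEGACY_SCRIPT_SUFFIX = {"ec": "cyrl", "el": "latn"}

# Assumed set of primary subtags for which the legacy suffixes are remapped.
SERBO_CROATIAN = {"sr", "sh"}


def fix_script_suffix(tag: str) -> str:
    """Replace a legacy -ec/-el suffix by -cyrl/-latn for the assumed languages.

    For any other language the second subtag is left alone, because there
    "ec" and "el" keep their ISO 3166-1 region meaning.
    """
    subtags = tag.strip().lower().split("-")
    if (
        len(subtags) >= 2
        and subtags[0] in SERBO_CROATIAN
        and subtags[1] in LEGACY_SCRIPT_SUFFIX
    ):
        subtags[1] = LEGACY_SCRIPT_SUFFIX[subtags[1]]
    return "-".join(subtags)


if __name__ == "__main__":
    for sample in ("sr-ec", "sr-EL", "es-ec"):
        print(sample, "->", fix_script_suffix(sample))
```

Restricting the remapping to a fixed set of languages is what avoids the conflict mentioned above: a tag such as es-ec is left unchanged, because in that tag the second subtag is a region code.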
Syntax:
Parameters:
  • 1= language tag.
Examples:
See also: