[MAGNOLIA-8709] Allow use of IETF Language Tags Created: 20/Jan/23  Updated: 08/Feb/24

Status: Accepted
Project: Magnolia
Component/s: i18n
Affects Version/s: 6.2.27
Fix Version/s: None

Type: Improvement Priority: Normal
Reporter: Mikaël Geljić Assignee: Robert Šiška
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: 1h
Original Estimate: Not Specified

Issue Links:
Relates
relates to MAGNOLIA-8229 Non JVM-default Locales are not used ... Closed
relates to MAGNOLIA-7897 Provide Compatibility And Documentati... Closed
relates to MAGNOLIA-8733 Add Locale variant support to I18nCon... Open
relates to MAGNOLIA-8667 Allow variants of locales in i18n con... Closed
documentation
to be documented by MAGNOLIA-9289 DOC: Using IETF Language Tags Accepted
relation
Template:
Acceptance criteria:
Empty
Task DoD:
[ ]* Doc/release notes changes? Comment present?
[ ]* Downstream builds green?
[ ]* Solution information and context easily available?
[ ]* Tests
[ ]* FixVersion filled and not yet released
[ ]  Architecture Decision Record (ADR)
Date of First Response:
Epic Link: Support
Team: DeveloperX

 Description   

Magnolia does not support IETF Language Tags (also known as BCP 47) in its languages configuration. IETF Language Tags are used in most web standards and allow a more flexible and semantically accurate representation of language variations and local specificities.

Historically, Magnolia languages can only be configured 1:1 according to Java Locale constructs, i.e. by their respective ISO language and country codes.

Proposed solution

Support a more permissive languageTag property in languages configuration, aka LocaleDefinitions. Then we can pass that to the BCP 47-interoperable factory method Locale#forLanguageTag, instead of using the plain Locale constructor.
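
A minimal sketch of the proposed resolution logic, assuming a languages configuration entry exposes an optional languageTag next to the existing language and country values. The class name LocaleResolutionSketch and the method resolve are hypothetical and only illustrate the idea; they are not existing Magnolia API.

```java
import java.util.Locale;

// Hypothetical sketch: names are illustrative, not Magnolia API.
class LocaleResolutionSketch {

    /**
     * Resolves a Locale from one languages-configuration entry.
     * Prefers the proposed languageTag property and falls back to
     * the legacy language/country pair when no tag is configured.
     */
    static Locale resolve(String languageTag, String language, String country) {
        if (languageTag != null && !languageTag.isEmpty()) {
            // BCP 47-aware factory: understands script, region and variant subtags.
            return Locale.forLanguageTag(languageTag);
        }
        // Legacy path: plain constructor, language and optional country only.
        return new Locale(language, country == null ? "" : country);
    }

    public static void main(String[] args) {
        Locale tagged = resolve("zh-Hant-TW", null, null);
        System.out.println(tagged.getLanguage()); // zh
        System.out.println(tagged.getScript());   // Hant
        System.out.println(tagged.getCountry());  // TW

        Locale legacy = resolve(null, "de", "CH");
        System.out.println(legacy.toLanguageTag()); // de-CH
    }
}
```

Note that Locale#forLanguageTag does not throw on ill-formed input: the first ill-formed subtag and everything after it are ignored. Any validation of configured tags would therefore need to be a separate step.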

See full research notes on Locales and IETF Language Tags.

Original Description

reported by chris.jennings

Finer Handling of Locales

As a provider of content across the globe, I would like finer handling of locales.

Magnolia's concept of locales revolves around the pattern of a two-letter language code and a two-letter country code, e.g. en_GB vs. en_US or de_CH vs. de_DE.

This does not account for languages written in more than one script, such as Chinese (zh), which is available as Simplified Chinese and Traditional Chinese.

Making these available using currently available techniques fails.
The locale in the URL is validated by isLocaleValid, which is given a Locale object created by parsing the URL into "language_country" in AbstractI18nContentSupport.

As a result, there is no way to signal to Magnolia that zh_HANS or zh_HANT is my locale. A combination of language and optional country is always assumed, with no room for a script.
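
The limitation can be reproduced with plain JDK calls. The sketch below assumes a simplified "language_country" split; parseLanguageCountry is a hypothetical stand-in and not the actual Magnolia parsing code.

```java
import java.util.Locale;

// Hypothetical sketch: not the Magnolia implementation.
class ScriptSubtagSketch {

    // Simplified stand-in for "language_country" parsing of a URL segment.
    static Locale parseLanguageCountry(String value) {
        String[] parts = value.split("_", 2);
        return new Locale(parts[0], parts.length > 1 ? parts[1] : "");
    }

    public static void main(String[] args) {
        // Legacy parsing files the script under "country"; the script stays empty.
        Locale legacy = parseLanguageCountry("zh_HANS");
        System.out.println(legacy.getCountry());          // HANS
        System.out.println(legacy.getScript().isEmpty()); // true

        // The BCP 47 factory recognises the script subtag as such.
        Locale tagged = Locale.forLanguageTag("zh-Hans");
        System.out.println(tagged.getScript());            // Hans
        System.out.println(tagged.getCountry().isEmpty()); // true
    }
}
```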

Notes

Supported Locales Java 11 - https://www.oracle.com/java/technologies/javase/jdk11-suported-locales.html

There is also this library https://javadoc.io/static/com.neovisionaries/nv-i18n/1.2/com/neovisionaries/i18n/CountryCode.html which conforms to the ISO 3166-1 country codes (see ISO 3166 Country Codes).

Would it be possible to make the library or lookup class configurable? I think in most cases what comes with the JVM is enough but perhaps I'd like to use a library which covers more codes or even use my own custom library.

Another layer is the corner case of "variant subtags". Consider Valencian where the IETF says:

Not all linguistic regions can be represented with a valid region subtag: the subnational regional dialects of a primary language are registered as variant subtags. For example, the valencia variant subtag for the Valencian variant of the Catalan is registered in the Language Subtag Registry with the prefix ca. As this dialect is spoken almost exclusively in Spain, the region subtag ES can normally be omitted.
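
For illustration, the JDK already parses the Valencian tag from the quote above when it is passed as a language tag. The helper describe below is a hypothetical name used only for this sketch.

```java
import java.util.Locale;

// Hypothetical sketch: shows how the JDK splits a tag carrying a variant subtag.
class VariantSubtagSketch {

    // Returns "language|script|country|variant" for the given BCP 47 tag.
    static String describe(String languageTag) {
        Locale locale = Locale.forLanguageTag(languageTag);
        return locale.getLanguage() + "|" + locale.getScript() + "|"
                + locale.getCountry() + "|" + locale.getVariant();
    }

    public static void main(String[] args) {
        // Region omitted, as the quote says it normally can be.
        System.out.println(describe("ca-valencia"));    // ca|||valencia
        // Region spelled out explicitly.
        System.out.println(describe("ca-ES-valencia")); // ca||ES|valencia
    }
}
```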

The language subtag registry.

Unicode Technical Standard #35



 Comments   
Comment by Mikaël Geljić [ 19/Oct/23 ]

Compiled and linked research notes from jayala and myself.

Goal is to allow easy configuration and use of IETF Language Tags (also known as BCP 47), the predominant language spec in web standards, in customer projects.

Will rephrase the ticket slightly.

Generated at Mon Feb 12 04:35:05 CET 2024 using Jira 9.4.2#940002-sha1:46d1a51de284217efdcb32434eab47a99af2938b.