[MAGNOLIA-5652] i18n property files should not need to be converted to ascii Created: 29/Jan/14  Updated: 06/Jun/18  Resolved: 28/Feb/14

Status: Closed
Project: Magnolia
Component/s: None
Affects Version/s: 5.2.2
Fix Version/s: 5.2.2

Type: Improvement Priority: Major
Reporter: Christopher Zimmermann Assignee: Daniel Lipp
Resolution: Fixed Votes: 0
Labels: i18n, support
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File Bildschirmfoto 2014-01-31 um 10.31.10.png     PNG File Bildschirmfoto 2014-01-31 um 10.31.20.png    
Issue Links:
Relates
relates to MGNLSTK-1342 Japanese translations appear garbled ... Closed
relates to MGNLUI-2641 Japanese translation for Link field "... Closed
causality
is causing MAGNOLIA-5702 i18n property files should not need t... Closed
dependency
is depended upon by MGNLUI-2558 Integrate all contributed japanese pr... Closed
relation
Template:
Acceptance criteria:
Empty
Task DoD:
[ ]* Doc/release notes changes? Comment present?
[ ]* Downstream builds green?
[ ]* Solution information and context easily available?
[ ]* Tests
[ ]* FixVersion filled and not yet released
[ ]  Architecture Decision Record (ADR)
Date of First Response:

 Description   

It is currently required for the "legacy" property files, (those that are stored in resources/info/magnolia/ui/[modulename] that they be in an escaped utf format, ascii that looks like this:
menu.stkDialogs = \u30c0\u30a4\u30a2\u30ed\u30b0\u5b9a\u7fa9
Whereas the "new" property files, (thosse stored in resources/mgnl-i18n) can be in UTF-8 their native character sets.

It causes problems that only some files must be in this escaped format and it is an unnecessary extra hassle to have to convert them. The i18n implementation should be changed to accept the standard unicode character sets for all translation property files.

Currently the files often have to be translated with a tool on the command line like so:
"native2ascii -encoding utf8 messages_ja_old.properties messages_ja.properties"

Note: assigning to Greg for his review of the improvement, and suggestions on implementation.



 Comments   
Comment by Christopher Zimmermann [ 29/Jan/14 ]

MGNLUI-2641

Comment by Christoph Meier [ 31/Jan/14 ]

"New" bundles (since introduction of the new i18n-concept) are loaded with a StreamReader which specfifies the encoding.
=> see info.magnolia.i18nsystem.DefaultMessageBundlesLoader#loadMessages (l.88)
Whereas the "legacy-bundles" are loaded by info.magnolia.cms.i18n.DefaultMessagesImpl#getBundle.

If DefaultMessagesImpl is using a streamReader which specfifies the encoding, too, this should fix the problem.

Comment by Christoph Meier [ 31/Jan/14 ]

My proposed fix seems to work.
But it was a little bit more than in DefaultMessageBundlesLoader. In bundle = ResourceBundle.getBundle(getBasename(), locale);
I could not easy set the encoding. for that i had to make an inner class to extend ResourceBundle.Control
then using bundle = ResourceBundle.getBundle(getBasename(), locale, new UTF8Control());
see see http://stackoverflow.com/questions/4659929/how-to-use-utf-8-in-resource-properties-with-resourcebundle

Comment by Christoph Meier [ 31/Jan/14 ]

probably needs further review by Greg.

Comment by Daniel Lipp [ 02/Feb/14 ]

New implementation is unnecessarily complex. We should take the simplest approach.
And of course we also want a unit-test to prevent from regression.

Comment by Christoph Meier [ 02/Feb/14 ]

The important point is to read the file from the disk with specified encoding.

In the „1st part“ that’s easy
bundle = new PropertyResourceBundle(stream); reader = new InputStreamReader(stream, "UTF-8"); bundle = new PropertyResourceBundle(reader);
but on the „2nd part“ i don’t see another possibility than using custom ResourceBundle.Control

But: The question is, why and when the „else-part“ (2nd part as mentioned above) is ever used. Would be nice to get rid of, if possible.

Comment by Magnolia International [ 03/Feb/14 ]

To try and clarify some of the above, and add my grain of salt:

  • "the first part", "the second part", refers to code where we 1) try to find the message files ourselves via the classloader and pass a stream to PropertyResourceBundle or 2) when we can't find the stream, fallback to ResourceBundle.getBundle, which afaik does essentially the same, but might be using a different (set of) classloader(s).
  • if there's indeed cases where this code is used, I suspect it'll be with different containers (JBoss has a tendency to be particular with how it handles classloaders, for instance). Removing that code would be risky, but would help us identify the conditions where it happens faster It would also make this class consistent with the "new" i18n framework (DefaultMessageBundlesLoader)
  • testcase: good idea; I'd suggest one of two things: encode the test string in ascii, so we avoid failures if the test source class itself gets corrupted/mis-encoded; perhaps even better: add a selfTest() where we assertEquals("追加する", "\u8ffd\u52a0\u3059\u308b") - validate the test with a test
Comment by Christian Ringele [ 28/Feb/14 ]

Seems its not working for SUPPORT-3268, reopening issue.

Comment by Christian Ringele [ 28/Feb/14 ]

Closing again, should have created new issue: MAGNOLIA-5702

Generated at Mon Feb 12 04:07:08 CET 2024 using Jira 9.4.2#940002-sha1:46d1a51de284217efdcb32434eab47a99af2938b.