[MGNLSTK-1323] Double-byte character is not displayed correctly in teaser carousel item Created: 17/Dec/13  Updated: 28/Jan/14  Resolved: 23/Jan/14

Status: Closed
Project: Magnolia Standard Templating Kit (closed)
Component/s: templates
Affects Version/s: 2.7
Fix Version/s: 2.7.2

Type: Bug Priority: Neutral
Reporter: Masao Suda Assignee: Federico Grilli
Resolution: Won't Fix Votes: 0
Labels: i18n
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File double-byte-japanese.png     JPEG File garbled-char-in-carouselItem.jpg     PNG File アジャイル株式会社_-_ホーム.png    
Template:
Acceptance criteria:
Empty
Release notes required:
Yes
Date of First Response:
Epic Link: Add Japanese translation

 Description   

– Repro Steps –
1. Edit the Home page.
2. Select Stage > Teaser Group > Teaser Items > Teaser Group - Carousel Item, then click the edit icon.
3. Select the Teaser Overwrite tab, then enter Japanese text.

– Problem –
Trimming of double-byte characters is wrong: a garbled character is displayed in the teaser.
The abbreviateString method does not seem to support double-byte characters.
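
To illustrate how trimming can garble a double-byte character, here is a minimal sketch. It does not show how abbreviateString is implemented (the ticket does not say). The class and both helper methods are hypothetical. It only contrasts cutting at a byte boundary, which can split a character, with cutting at a code point boundary, which cannot.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: hypothetical helpers, not the STK implementation.
class TruncationSketch {

    // Cuts the UTF-8 encoding after maxBytes bytes, which may split a character.
    static String truncateBytes(String text, int maxBytes) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        int length = Math.min(maxBytes, utf8.length);
        // A partial trailing byte sequence is decoded as U+FFFD (replacement character).
        return new String(utf8, 0, length, StandardCharsets.UTF_8);
    }

    // Cuts after maxCodePoints whole code points, so no character is split.
    static String truncateCodePoints(String text, int maxCodePoints) {
        int count = text.codePointCount(0, text.length());
        if (count <= maxCodePoints) {
            return text;
        }
        int end = text.offsetByCodePoints(0, maxCodePoints);
        return text.substring(0, end);
    }

    public static void main(String[] args) {
        String text = "数年前に韓国の";
        // Each of these characters takes 3 bytes in UTF-8, so 4 bytes ends mid-character.
        System.out.println(truncateBytes(text, 4));      // first character, then U+FFFD
        System.out.println(truncateCodePoints(text, 2)); // first two characters, intact
    }
}
```

If the garbled character in the teaser is U+FFFD or a similar stand-in, a byte-level cut or a charset mismatch somewhere in the pipeline is a plausible suspect.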

– Workaround –
1. Edit /templating-kit/components/teasers/carouselItem.
2. Change the following:

  • stkfn.abbreviateString(text, 370)
  • stkfn.abbreviateString(text, 330)
    to
    text


 Comments   
Comment by Jan Haderka [ 20/Dec/13 ]

Same is the problem also for Arabic and most likely Chinese.

Comment by Federico Grilli [ 20/Jan/14 ]

Hello, I tried to reproduce the issue by entering some Japanese text that is supposed to contain double-byte characters (I found it on the internet and don't even know what it means), but everything looks fine (see attached screenshot). Could you please provide some text that causes the issue?

Comment by Magnolia International [ 21/Jan/14 ]

Just a hunch, but this could be another manifestation of the different Unicode normalization forms used by different platforms, NFC and NFD respectively. See http://wiki.magnolia-cms.com/display/DEV/Unicode+support+status for some details.
(Perhaps a quick way of checking would be to compare the output of abbreviateString when passing an NFD string and an NFC string; the wiki page has a couple of samples.)
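
A minimal sketch of the NFC/NFD comparison suggested above, using the JDK's java.text.Normalizer. The class name is hypothetical. A plain substring call stands in for abbreviateString, which is an assumption made only so that the example is self-contained.

```java
import java.text.Normalizer;

// Illustrative only: substring() stands in for abbreviateString.
class NormalizationSketch {

    static String nfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    static String nfd(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    public static void main(String[] args) {
        String word = "ベンチャー"; // taken from the sample text in this ticket
        String composed = nfc(word);
        String decomposed = nfd(word);

        // NFD splits the first character into a base kana plus a combining voiced mark.
        System.out.println(composed.length());   // 5
        System.out.println(decomposed.length()); // 6

        // Cutting after one char keeps the full character in NFC but only the base kana in NFD.
        System.out.println(composed.substring(0, 1));
        System.out.println(decomposed.substring(0, 1));
    }
}
```

The same visible text therefore has a different length, and different safe cut points, depending on the normalization form in which it arrives.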

Comment by Masao Suda [ 22/Jan/14 ]

This problem occurs when I enter the following two-line text.

数年前に韓国のセキュリティベンチャー企業により発明された、文字を一切使用せずアイコンのみで本人認証をさせる認証方式です。
従来の認証方式に比べて、特に覗き見耐性が高い方式になっています。

I entered the above text into the Text field of the Teaser Overwrite tab.
I reproduced the problem in both Windows and Linux (AWS EC2) environments.

EC2 Location is here - http://54.238.240.14/index.html (please see 2nd teaser item).

Comment by Federico Grilli [ 22/Jan/14 ]

Hello, and thanks for spotting this issue. I can confirm the text also looks garbled on Mac OS X (10.8.5) and in different browsers (Safari, Chrome, FF). The garbled character is different, but it is still messed up.

Comment by Federico Grilli [ 22/Jan/14 ]

I have to recant what I said before, in that I can't reproduce the issue locally on my dev machine (Mac OS X 10.8.5) where the magnolia webapp is served by Tomcat 7.0.47 via Eclipse with -Dfile.encoding=UTF-8. Of course, I saw the issue at http://54.238.240.14/index.html and at our http://demoauthor.magnolia-cms.com (which is served by a Tomcat 7.0.47 on an Ubuntu box iirc). I also wrote a small unit test to verify the output when trimming that string at the point where the problem arises on other systems but it runs fine. So I guess it's a platform issue and Greg's hypothesis seems to be correct.

  @Test
  public void testAbbreviateDoubleByteString() throws Exception {
      // GIVEN
      int size = 46;
      String stringToCut = "数年前に韓国のセキュリティベンチャー企業により発明された、文字を一切使用せずアイコンのみで本人認証をさせる認証方式です。\n従来の認証方式に比べて、特に覗き見耐性が高い方式になっています。";

      // WHEN
      String res = stkTemplatingFunction.abbreviateString(stringToCut, size);

      // THEN
      assertEquals("数年前に韓国のセキュリティベンチャー企業により発明された、文字を一切使用せずアイコン ...", res);
  }

Comment by Federico Grilli [ 23/Jan/14 ]

I'm not sure if the chosen resolution actually fits this case. I was tempted to use "not an issue" or "cannot reproduce" but both seemed somehow incorrect. I used "won't fix" because the issue seems not related to Magnolia code or configuration, so we can't fix it.
However, to sum it up, it looks like an encoding issue, and it should be enough to ensure that the encoding used by both the platform and the servlet container is UTF-8. This should prevent a multi-byte character from being truncated incorrectly and displayed as an invalid character.
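
A minimal sketch of how the platform encoding could be checked from Java. The class and method names are hypothetical, and this is not part of Magnolia. It prints the JVM default charset (which the -Dfile.encoding=UTF-8 setting mentioned earlier in this ticket influences) and shows that Japanese text survives a round trip through UTF-8 but not through a single-byte charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative only: a quick diagnostic, not Magnolia code.
class EncodingCheckSketch {

    // True if the text is unchanged after being encoded and decoded with the given charset.
    static boolean roundTrips(String text, Charset charset) {
        byte[] encoded = text.getBytes(charset);
        return text.equals(new String(encoded, charset));
    }

    public static void main(String[] args) {
        // The charset the JVM uses when none is specified explicitly.
        System.out.println("default charset: " + Charset.defaultCharset());

        String text = "認証方式";
        System.out.println(roundTrips(text, StandardCharsets.UTF_8));      // true
        // ISO-8859-1 cannot represent these characters, so they are replaced on encoding.
        System.out.println(roundTrips(text, StandardCharsets.ISO_8859_1)); // false
    }
}
```

Running this on an affected server and on a machine where the issue does not reproduce would show whether the default charsets differ.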

Comment by Federico Grilli [ 27/Jan/14 ]

@Masao Suda - Please, also take a look at http://documentation.magnolia-cms.com/display/DOCS/Language to learn more about language support in Magnolia.

Comment by Daniel Lipp [ 28/Jan/14 ]

Note to avoid confusion: this is a won't fix - the associated commits are just beautifications (formatting changes) and additions in the tests to prove that it's working as expected already.
