MAGNOLIA-3008

Changes in the handling of URL decoding in 4.2 break non-ASCII URLs that were working properly in previous versions


    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: 4.3
    • Affects Version/s: 4.2, 4.2.1, 4.2.2, 4.1.4, 4.2.3

      After upgrading from 4.1 to 4.2 (4.2.3) we saw several problems related to URL encoding, and it looks like the changes in Magnolia 4.2 introduced some regressions.

      The related jiras are:
      MAGNOLIA-2524 - AggregationState.decodeURI is wrong - review
      MAGNOLIA-2899 - Make ServletDispatchingFilter i18n aware

      The problem: before 4.2 we were happily using virtual URIs that worked properly with non-ASCII URLs, for example:

      url: http://www.myserver.com/path/{test}
      
      
      VirtualURIMapping code:
      
      public MappingResult mapURI(String uri)
      {
          if (StringUtils.contains(uri, "{test}"))
          {
              // do something
          }
          // ...
      }
      
      

      Everything used to work properly (and as expected) in Magnolia < 4.2. The condition above was effectively evaluated as:

                StringUtils.contains("http://www.myserver.com/path/{test}", "{test}")
      

      After upgrading to 4.2, the URL passed to the VirtualURIMapping is
      http://www.myserver.com/path/%7Btest%7D

      And of course this breaks everything:

                StringUtils.contains("http://www.myserver.com/path/%7Btest%7D", "{test}")
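
The mismatch can be reproduced outside Magnolia. The following is only a minimal sketch: the class `DecodedUriMatch` and its methods are hypothetical helpers, not Magnolia API, and it assumes the URI is percent-encoded as UTF-8. It shows that the raw 4.2 input no longer contains the literal token, while the same check succeeds once the URI is decoded first.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper for illustration; not part of Magnolia.
final class DecodedUriMatch {

    // Percent-decodes a URI as UTF-8 (assumption: the client encoded it as UTF-8).
    // Caveat: URLDecoder follows form-encoding rules and also turns '+' into a space,
    // which may not be what you want for path segments.
    static String decode(String uri) {
        try {
            return URLDecoder.decode(uri, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    // True if the decoded URI contains the literal token.
    static boolean containsDecoded(String uri, String token) {
        return decode(uri).contains(token);
    }

    public static void main(String[] args) {
        String encoded = "http://www.myserver.com/path/%7Btest%7D";
        System.out.println(encoded.contains("{test}"));         // check on the raw input: false
        System.out.println(containsDecoded(encoded, "{test}")); // check after decoding: true
    }
}
```

Decoding inside each mapping would only be a workaround; it does not restore the pre-4.2 behaviour for the rest of the filter chain.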
      

      This also happens with accented characters (àèìòù), which are very common in Italian. The path received by the CMS filter is now left undecoded as well: in the past, a request for the path "/già" resulted in a request for the content "/già" in the website repository. (Magnolia strips accented characters when pages are created from the admin interface, but such paths are perfectly legal in JCR, and content loaded from XML with such paths was handled properly by Magnolia.) Now the HierarchyManager looks for /gi%something, which is definitely not good.

      Note that this also happens after configuring the Tomcat connector to use UTF-8 (see http://wiki.magnolia-cms.com/display/DEV/URI+encoding+in+Tomcat ), which was already needed in the past to get GET parameters decoded correctly.
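
To illustrate why both the decoding step and its charset matter for a path such as "/già", here is a small sketch. It assumes the client percent-encodes the path as UTF-8 (so "à" becomes %C3%A0); the class `PathCharsetDemo` is hypothetical and only uses the JDK's `URLDecoder`. It is meant as an illustration of the symptom, not as the fix applied in Magnolia.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Illustrative only; class and method names are hypothetical.
final class PathCharsetDemo {

    // Percent-decodes a path using the given charset name.
    static String decode(String path, String charset) {
        try {
            return URLDecoder.decode(path, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // "/già" percent-encoded as UTF-8 (assumed client behaviour).
        String raw = "/gi%C3%A0";

        // Decoded as UTF-8, the two bytes C3 A0 become the single character U+00E0.
        System.out.println(decode(raw, "UTF-8").equals("/gi\u00E0"));            // true

        // Decoded as ISO-8859-1, each byte becomes its own character (U+00C3, U+00A0).
        System.out.println(decode(raw, "ISO-8859-1").equals("/gi\u00C3\u00A0")); // true

        // Left undecoded, the path cannot match a node named "già".
        System.out.println(raw.equals("/gi\u00E0"));                             // false
    }
}
```

A repository lookup would only find the node named "già" in the first case, which is why an undecoded path reaching the HierarchyManager fails regardless of the connector setting.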

      Unfortunately this looks like a blocker for the upgrade from 4.1 to 4.2: I couldn't find any configuration that restores the old behaviour and makes non-ASCII URLs work again (for example, we use a lot of SEO-friendly virtual URIs that map Italian words).


              fgiust Fabrizio Giustina