[MAGNOLIA-1512] Request charset should be set before any req.getParameter call Created: 08/May/07  Updated: 23/Jan/13  Resolved: 14/May/07

Status: Closed
Project: Magnolia
Component/s: core
Affects Version/s: None
Fix Version/s: 3.1 M2

Type: New Feature Priority: Major
Reporter: Magnolia International Assignee: Magnolia International
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
dependency
depends upon MAGNOLIA-1523 Aggregator can become a simple bean a... Closed
is depended upon by MGNLFORUM-15 Encoding issue when posting messages Closed
Template:
Acceptance criteria:
Empty
Date of First Response:

 Description   

At the moment, the request charset is set using Aggregator.getExtension(), which in turn uses Context.getAttribute. The WebContextImpl implementation of getAttribute calls getParameter when the attribute does not exist on the request, which makes the later setCharacterEncoding call useless and prevents parameters from being parsed properly: the container parses them on the first call to getParam*() with a default encoding (iso-8859-1 in the case of tomcat).
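
To illustrate the ordering problem, here is a minimal sketch. FakeRequest is a hypothetical stand-in written for this example (it is not the Servlet API or Magnolia code); it only mimics the behaviour described above, where parameters are parsed lazily on the first getParameter call and a later setCharacterEncoding has no effect. The parameter names and the query string are made up.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a servlet request; not a real container class.
class FakeRequest {
    private final String rawQuery;
    private String encoding;             // null until explicitly set
    private Map<String, String> parsed;  // filled on the first getParameter call

    FakeRequest(String rawQuery) {
        this.rawQuery = rawQuery;
    }

    void setCharacterEncoding(String enc) {
        // Once parameters have been parsed, changing the encoding has no effect.
        if (parsed == null) {
            encoding = enc;
        }
    }

    String getParameter(String name) {
        if (parsed == null) {
            parsed = new HashMap<>();
            // Fall back to ISO-8859-1 when no encoding was set before parsing.
            String enc = (encoding != null) ? encoding : "ISO-8859-1";
            for (String pair : rawQuery.split("&")) {
                String[] kv = pair.split("=", 2);
                try {
                    String key = URLDecoder.decode(kv[0], enc);
                    String value = kv.length > 1 ? URLDecoder.decode(kv[1], enc) : "";
                    parsed.put(key, value);
                } catch (UnsupportedEncodingException e) {
                    throw new IllegalArgumentException(e);
                }
            }
        }
        return parsed.get(name);
    }
}

class CharsetOrderDemo {
    // Reads a parameter first (as an attribute lookup falling back to
    // getParameter would), then sets the encoding: too late.
    static String wrongOrder() {
        FakeRequest req = new FakeRequest("name=caf%C3%A9");
        req.getParameter("extension");
        req.setCharacterEncoding("UTF-8");
        return req.getParameter("name");
    }

    // Sets the encoding before anything touches the parameters.
    static String rightOrder() {
        FakeRequest req = new FakeRequest("name=caf%C3%A9");
        req.setCharacterEncoding("UTF-8");
        return req.getParameter("name");
    }

    public static void main(String[] args) {
        System.out.println("wrong order: " + wrongOrder());
        System.out.println("right order: " + rightOrder());
    }
}
```

In the wrong order, the two UTF-8 bytes of the accented character are decoded as two separate ISO-8859-1 characters, which is the kind of corruption reported in MGNLFORUM-15.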

We might solve this by making the Aggregator.getExtension method (and associated methods) use only the context attribute, which would have to be set beforehand, for instance by a new filter that has yet to be created.



 Comments   
Comment by Magnolia International [ 10/May/07 ]

A new filter will (partially) initialise the Aggregator instance (which will be exposed in the context once MAGNOLIA-1523 is fixed) with the basic info from the request.

Comment by Magnolia International [ 10/May/07 ]

Actually, with MAGNOLIA-1523 cleaning up a lot of Aggregator / Path cruft, this could still all happen in the ContentTypeFilter, even though it will still be wrong to set the request's encoding based on what url is requested, instead of on what page it's coming from. We could base ourselves on the referer's uri, but that seems unclean, too.

Normally, the browser should send a character set in the request, and if so, the container SHOULD take it into account, only falling back to iso-8859-1 if that's not the case. It is still unclear what tomcat really does, but I've noticed my firefox does not send any charset header in its requests, even if the originating page has meta tags, a form enctype attribute, etc.

Comment by Magnolia International [ 11/May/07 ]

done

Comment by Magnolia International [ 11/May/07 ]

reopening: make sure we don't override any existing charset... there actually ARE ways to make the browser send it! VERY ugly but works:
https://bugzilla.mozilla.org/show_bug.cgi?id=18643

Comment by Magnolia International [ 14/May/07 ]

fixed: not overriding the request encoding if already set. However, the
<input type="hidden" name="charset" /> field does not make the browser send an encoding charset within the request headers like I thought it would. All it does is set the value of that field to the encoding used in the page. Since this is very hacky, I'd rather not implement support for it in the main magnolia code. If someone really needed this, they could override the setupContentTypeAndCharacterEncoding method of ContentTypeFilter.

Comment by Philipp Bracher [ 15/May/07 ]

You could introduce a configurable force encoding flag.

Comment by Magnolia International [ 15/May/07 ]

It is forced, in a sense, now: the request's character encoding is going to depend on the requested URI's extension/mimetype.
ContentTypeFilter skips setting it only if it is already set (by another 3rd party filter or by the container, for instance).
It wouldn't make sense to not set it at all (tomcat would use iso-8859-1) since we set it to serve the magnolia pages (thus the form data is encoded by the browser in that page's response encoding).
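
The rule described here (keep an existing encoding, otherwise derive one from the requested URI's extension) could be sketched roughly as follows. This is not the actual ContentTypeFilter code: the class name, the extension-to-charset table and the UTF-8 default are all assumptions made for illustration. In a real filter, the existing encoding would presumably come from request.getCharacterEncoding() (which returns null when none is specified) and the result would be passed to request.setCharacterEncoding() before any parameter is read.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of the "do not override, otherwise derive from extension" rule.
class EncodingGuardSketch {
    // Assumed mapping for illustration; a real setup would read this from configuration.
    static final Map<String, String> CHARSET_BY_EXTENSION = new HashMap<>();
    static {
        CHARSET_BY_EXTENSION.put("html", "UTF-8");
        CHARSET_BY_EXTENSION.put("txt", "ISO-8859-1");
    }

    // Returns the lower-cased extension of the last path segment, or "" if there is none.
    static String extensionOf(String uri) {
        int slash = uri.lastIndexOf('/');
        int dot = uri.lastIndexOf('.');
        return dot > slash ? uri.substring(dot + 1).toLowerCase(Locale.ROOT) : "";
    }

    // existingEncoding is what the container or an earlier filter already set, or null.
    static String resolveEncoding(String existingEncoding, String uri) {
        if (existingEncoding != null) {
            return existingEncoding; // never override an encoding that is already set
        }
        // Assumed default of UTF-8 for unknown extensions.
        return CHARSET_BY_EXTENSION.getOrDefault(extensionOf(uri), "UTF-8");
    }

    public static void main(String[] args) {
        System.out.println(resolveEncoding("ISO-8859-1", "/forum/post.html"));
        System.out.println(resolveEncoding(null, "/forum/post.html"));
    }
}
```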

Generated at Mon Feb 12 03:27:38 CET 2024 using Jira 9.4.2#940002-sha1:46d1a51de284217efdcb32434eab47a99af2938b.