[MAGNOLIA-200] Cache html has no character encoding Created: 21/Nov/04  Updated: 23/Jan/13  Resolved: 16/Aug/05

Status: Closed
Project: Magnolia
Component/s: core
Affects Version/s: 2.0 Final
Fix Version/s: 2.1 Final

Type: Bug Priority: Major
Reporter: Massimiliano Segreto Assignee: Sameer Charles
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Linux + jsdk 1.4.2_06


Template:
Acceptance criteria:
Empty
Task DoD:
[ ]* Doc/release notes changes? Comment present?
[ ]* Downstream builds green?
[ ]* Solution information and context easily available?
[ ]* Tests
[ ]* FixVersion filled and not yet released
[ ]  Architecture Decision Record (ADR)
Bug DoR:
[ ]* Steps to reproduce, expected, and actual results filled
[ ]* Affected version filled

 Description   

The problem is with the cached html contents.

1) Save a paragraph with accented characters (èéiàòù)
in any field. The character encoding will be ISO-8859-1.

2) Publish the page. This invalidates the cache on the public instance.

3) Go to the public instance and browse the page. The first time, the page shows the correct characters. If you look at the HTTP headers you will find:
---------------
HTTP/1.x 200 OK
Content-Type: text/html;charset=ISO-8859-1
Content-Length: 2244
Date: Sun, 21 Nov 2004 12:56:43 GMT
Server: Apache-Coyote/1.1
---------------

4) Clear the browser cache, navigate somewhere else and then back to the page, or reload the page. You now get strange characters instead of the original accented ones, depending on the platform/browser you use.
If you force the browser to use ISO-8859-1 (in Firefox: View > Character Encoding > Western), the page looks fine.
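
The garbling in step 4 is consistent with the browser decoding ISO-8859-1 bytes under a different default charset. The following is only a sketch of that effect, assuming the browser falls back to UTF-8 (the report does not say which charset each browser picked). The class and method names are hypothetical, and it uses a modern Java API rather than the 1.4.2 runtime listed under Environment.

```java
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {
    // Decode ISO-8859-1 bytes as if they were UTF-8, as a browser might
    // when the Content-Type header carries no charset (assumption).
    static String misdecode(String text) {
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source independent of file encoding.
        String original = "\u00e8\u00e9\u00e0\u00f2\u00f9";
        String garbled = misdecode(original);
        System.out.println(garbled.equals(original)); // false
    }
}
```

Plain ASCII text survives the mismatch unchanged, which is why only the accented characters look wrong.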

The problem: content served from the cache is gzip-compressed, and its Content-Type no longer carries the original character set. The HTTP headers are:
---------------------
HTTP/1.x 200 OK
Last-Modified: Sun, 21 Nov 2004 13:29:01 GMT
Content-Encoding: gzip
Content-Type: text/html
Content-Length: 968
Date: Sun, 21 Nov 2004 14:58:32 GMT
Server: Apache-Coyote/1.1
---------------------
Note that the Content-Type has no charset parameter.
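
One way to picture a fix is to remember the charset of the original response and re-attach it when the cached copy is served. The helper below is a hypothetical sketch of that idea, not Magnolia's actual cache code or the change made for 2.1; `ContentTypeFix` and `withCharset` are names invented for illustration.

```java
import java.util.Locale;

public class ContentTypeFix {
    // Return the Content-Type value with a charset parameter appended
    // when none is present; an existing charset is left untouched.
    static String withCharset(String contentType, String charset) {
        if (contentType == null || charset == null) {
            return contentType;
        }
        if (contentType.toLowerCase(Locale.ROOT).contains("charset=")) {
            return contentType;
        }
        return contentType.trim() + ";charset=" + charset;
    }

    public static void main(String[] args) {
        // Header value from the cached response in this report.
        System.out.println(withCharset("text/html", "ISO-8859-1"));
        // prints "text/html;charset=ISO-8859-1"
    }
}
```

Gzip compression only changes Content-Encoding, so the charset recorded for the uncompressed page still applies to the compressed copy.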

Workarounds I have found so far:

1) Disable the cache. This hurts performance but keeps the character encoding flexible.

2) Keep the cache enabled but hardcode the character encoding in Config:
Path: /server/MIMEMapping/html/mime-type

value: text/html;charset=ISO-8859-1

With this setting, all HTML content is served with the ISO-8859-1 character encoding.
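
To check what charset a configured mime-type value actually declares, the charset parameter can be pulled out of the string. This is a small illustrative parser with hypothetical names (`MimeCharset`, `charsetOf`); it does not handle quoted parameter values and is not part of Magnolia.

```java
public class MimeCharset {
    // Extract the charset parameter from a Content-Type style value,
    // or return null when no charset parameter is present.
    static String charsetOf(String mimeValue) {
        String[] parts = mimeValue.split(";");
        // parts[0] is the media type itself; parameters follow it.
        for (int i = 1; i < parts.length; i++) {
            String p = parts[i].trim();
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                return p.substring(8).trim();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/html;charset=ISO-8859-1")); // ISO-8859-1
        System.out.println(charsetOf("text/html"));                    // null
    }
}
```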

