[MGNLEE-207] Random encoding problems with Apache and GZIP filter Created: 23/Jun/11 Updated: 17/Dec/12 Resolved: 17/Dec/12 |
|
| Status: | Closed |
| Project: | Magnolia DX Core |
| Component/s: | None |
| Affects Version/s: | 4.4.3 |
| Fix Version/s: | 4.5.7 |
| Type: | Bug | Priority: | Major |
| Reporter: | Leo Lozes | Assignee: | Jan Haderka |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | apache, encoding, filter, gzip |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Environment: |
System: Amazon EC2
OS: Oracle Enterprise Linux 5.3 x86_64, with EPEL and CentOS Yum repositories |
| Description |
|
We are seeing randomly scrambled resources / pages in our testing environment. This seems to be caused by an incompatibility between the Apache and Magnolia gzip configurations.

Apache VirtualHost configuration: <VirtualHost *:80>

GZIP configurations tested: Direct HTTP Magnolia with gzip - OK

What does WRONG mean? Randomly, resources get scrambled: sometimes the main HTML resource, sometimes one or more CSS resources, sometimes JavaScript. The pages appear broken in random, funny ways, but not always. If you reload the same page over and over again, the result changes almost every time: it looks OK, then it's broken again, then it's OK three times in a row, and so on.

What does SCRAMBLED mean? The resources look like random binary gibberish, but almost certainly NOT pure gzip-compressed data. They contain many UTF-8 "unknown character" byte sequences, EF BF BD in hexadecimal (the replacement character U+FFFD, usually rendered as a diamond with a question mark in the middle). |
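The EF BF BD symptom described above can be reproduced in isolation. The following is a minimal sketch, assuming (this is not confirmed anywhere in this issue) that some layer treats the gzip-compressed bytes as UTF-8 text and re-encodes them; the `scramble` helper and the sample page are hypothetical and only illustrate why the result is neither valid gzip nor readable text.

```python
import gzip

# UTF-8 encoding of U+FFFD, the "unknown character" seen in the scrambled pages.
REPLACEMENT = "\ufffd".encode("utf-8")  # bytes EF BF BD


def scramble(body: str) -> bytes:
    """Hypothetical helper: gzip a page, then mishandle the bytes as UTF-8 text."""
    compressed = gzip.compress(body.encode("utf-8"))
    # Every byte sequence that is not valid UTF-8 is replaced by U+FFFD.
    as_text = compressed.decode("utf-8", errors="replace")
    return as_text.encode("utf-8")


page = "<html><body>" + "Hello, Magnolia! " * 50 + "</body></html>"
scrambled = scramble(page)

# The gzip magic bytes 1F 8B are already damaged: 8B is not valid UTF-8 on its own.
print(scrambled[:5].hex(" "))
print("replacement sequences:", scrambled.count(REPLACEMENT))
```

Output damaged this way can no longer be decompressed, which would be consistent with data that looks binary but is "NOT pure gzip-compressed data".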
| Comments |
| Comment by Christian Ringele [ 23/Jun/11 ] |
|
I have the same issue on Windows 2003, Apache 2.1, mod_proxy_ajp not in use (just mod_jk.so & JkMount). The cause of the problem seems to be a change between 4.3.x and 4.4.x. |
| Comment by Jan Haderka [ 23/Jun/11 ] |
IMHO this configuration can never work. Magnolia sees that the client accepts gzip (Apache doesn't strip that info from the request header) and therefore encodes the response. However, Apache strips off the gzip encoding info w/o actually decoding the response data, hence clients get encoded data w/o knowing that it is encoded. Some browsers analyze incoming data, figure it out and decode it anyway, but others do not. Also, when you reload the page and the incoming data doesn't seem correct to the browser, it sometimes chooses to display a previously cached correct version of the page (I've seen this with IE and Safari). My recommendation would be to always keep the encoding settings of Apache and Magnolia in sync. There is currently no way for Magnolia to automatically figure out the configuration of the Apache server in front of it. |
| Comment by Leo Lozes [ 23/Jun/11 ] |
|
We attached the headers of two requests / responses, one with gzip activated in Apache and the other without it. As you can see, the two sets of headers are exactly the same (except for the time, and it's not Photoshop). So we don't really understand your comment "However Apache strips off the gzip encoding info" ... |
| Comment by Jan Haderka [ 23/Jun/11 ] |
|
Thx for the extra info. I would expect the header being stripped off to cause the issue in this case. A few more questions:
|
| Comment by Christian Ringele [ 24/Jun/11 ] |
|
I have the same behavior on an Apache 2.1 with no specific gzip encoding configuration. |
| Comment by Jan Haderka [ 24/Jun/11 ] |
Any reason why you are using an even older version of Apache?
By x you mean the latest ones? 4.3.8 and 4.4.4? |
| Comment by Christian Ringele [ 24/Jun/11 ] |
|
OK, I checked the system again, and it's good that you asked: it is Apache 2.2.6, not using mod_proxy_ajp or any specific gzip configuration. So here is the definitive list of the instances used, all served through the same Apache: |
| Comment by Leo Lozes [ 27/Jun/11 ] |
|
Hi Jan,
1. The full configuration of Apache is in our first message. We didn't configure it to set the Apache 2.2.3 header; it's only a "gateway" to redirect the requests to the corresponding Tomcat.
And we'll try to get the response examples you asked for.
Thanks |
| Comment by Leo Lozes [ 27/Jun/11 ] |
|
Here you have screenshots of both headers and content with scrambled pages and "good" pages. |
| Comment by Jan Haderka [ 04/Jul/11 ] |
|
The headers in the attached rar file look different than the ones you originally submitted with this issue. In the attached rar file, when the content is scrambled, the response headers do not contain content-encoding (gzip), the transfer-encoding header is set to chunked, and the Magnolia registration header is missing completely. I believe in this case it is Apache sending cached gzipped content w/o the headers that would allow the browser to recognize that the content is actually gzipped. |
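The condition suspected in the last comment (a gzipped body delivered without a content-encoding header) can be checked mechanically on a captured response. This is a minimal sketch, not part of Magnolia or Apache: `looks_mislabelled` is a hypothetical helper, and it assumes the raw, undecoded body bytes and the response headers are available as a dict.

```python
import gzip

# Every gzip stream starts with these two magic bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def looks_mislabelled(headers: dict, body: bytes) -> bool:
    """Return True if the body looks gzipped but no gzip Content-Encoding is declared."""
    # Header names are case-insensitive, so normalise them before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    declared = lowered.get("content-encoding", "").lower()
    return body.startswith(GZIP_MAGIC) and "gzip" not in declared


compressed = gzip.compress(b"<html><body>test page</body></html>")

# Case suspected above: gzipped payload, header missing.
print(looks_mislabelled({"Transfer-Encoding": "chunked"}, compressed))  # True
# Correctly labelled response.
print(looks_mislabelled({"Content-Encoding": "gzip"}, compressed))      # False
```

Running such a check against the raw responses from both the "good" and the scrambled captures would show whether the scrambled ones are intact gzip streams that merely lack the header, or data that was damaged further along the way.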