[MAGNOLIA-3306] HTTP HEAD request returns status code 403, while GET returns 200 Created: 24/Sep/10  Updated: 25/Sep/11  Resolved: 21/Dec/10

Status: Closed
Project: Magnolia
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Felix Rabe Assignee: Jan Haderka
Resolution: Not an issue Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: JPEG File Screen shot 2010-09-24 at 3.10.49 PM.jpg    

 Description   

For most Magnolia instances out in the wild, including the corporate website, sending an HTTP HEAD request triggers a 403 Forbidden response, while an HTTP GET works fine. See the attached screenshot. (Hint: Day Software gets it right, and navy.com works correctly too...)

To reproduce what I did in the screenshot, enter in a terminal:

$ nc somedomain 80
HEAD / HTTP/1.1
Host: somedomain

... (followed by an empty line to finish the header) and then comes the response from the server. Expected behaviour would be that the HEAD request gets the same response (minus content) as a GET request.
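
The manual nc session above can also be scripted. The following is a minimal sketch, assuming Python 3 and its standard http.client module; the function name head_and_get_status is made up for illustration, and "somedomain" stays the same placeholder as in the steps above.

```python
import http.client


def head_and_get_status(host, port=80, path="/"):
    """Return (HEAD status, GET status) for the same path on one server."""
    statuses = []
    for method in ("HEAD", "GET"):
        # A fresh connection per request keeps the two checks independent.
        conn = http.client.HTTPConnection(host, port, timeout=10)
        try:
            conn.request(method, path)
            response = conn.getresponse()
            response.read()  # drain the body (always empty for HEAD)
            statuses.append(response.status)
        finally:
            conn.close()
    return tuple(statuses)
```

Calling head_and_get_status("somedomain") against an affected instance would be expected to give a pair such as (403, 200), whereas a server that treats HEAD like GET returns the same status twice.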

This issue was brought to my attention today when Antti wanted to find the broken download link on http://www.magnolia-cms.com/home.html using
http://validator.w3.org/checklink/, resulting in http://validator.w3.org/checklink/checklink?uri=http%3A%2F%2Fwww.magnolia-cms.com%2Fhome.html&hide_type=all&depth=&check=Check (lots of 403 errors). The link checker correctly uses HTTP HEAD requests instead of HTTP GET requests (the ones you normally do with your web browser when going anywhere).

This is how HTTP HEAD should work: (quoting http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.4)

The HEAD method is identical to GET except that the server MUST NOT return a message-body in the response. The metainformation contained in the HTTP headers in response to a HEAD request SHOULD be identical to the information sent in response to a GET request. This method can be used for obtaining metainformation about the entity implied by the request without transferring the entity-body itself. This method is often used for testing hypertext links for validity, accessibility, and recent modification.

I have also tested this locally with an admin instance on port 8080. It does not work there either:

~ $ nc localhost 8080
HEAD /magnolia-webapp-registration/.magnolia/pages/adminCentral.html HTTP/1.1
Host: localhost:8080

HTTP/1.1 403 Forbidden
Server: Apache-Coyote/1.1
X-Magnolia-Registration: Registered
Content-Type: text/html;charset=UTF-8
Content-Length: 964
Date: Fri, 24 Sep 2010 14:23:23 GMT

(A GET request gets me 401 Unauthorized, which is the correct response, as I have to log in first.)



 Comments   
Comment by Jan Haderka [ 21/Dec/10 ]

Either you are missing steps to reproduce or this is not an issue. Make sure that config:/server/IPConfig/allow-all/methods allows HEAD. The default setting allows only GET and POST. So unless you have configured HEAD as allowed, the Forbidden response is correct, since your Magnolia instance explicitly forbids HEAD requests.
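
The check described in this comment can be pictured as a simple allowlist lookup. The sketch below is illustrative Python, not Magnolia's actual implementation; the function name status_for_method and the use of 200 as a stand-in for "request is let through" are assumptions, while the comma-separated value mirrors the methods setting named above.

```python
def status_for_method(method, allowed_methods):
    """Return 403 if `method` is missing from the comma-separated allowlist, else 200."""
    allowed = {m.strip().upper() for m in allowed_methods.split(",") if m.strip()}
    return 200 if method.upper() in allowed else 403


print(status_for_method("HEAD", "GET,POST"))       # 403
print(status_for_method("HEAD", "GET,POST,HEAD"))  # 200
```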

Comment by Mark Halvorson [ 23/Sep/11 ]

I am having this same problem. My settings seem correct in config:/server/IPConfig/allow-all/methods

However, I'm wondering whether the anonymous role is blocking it. Anonymous only allows GET and POST; could that be the reason?

https://skitch.com/halv0112/f5x7s/edit-role

Comment by Mark Halvorson [ 23/Sep/11 ]

I found that there is a difference between Community Edition and Enterprise Edition.

On Community Edition (where I was having the problem), config:/server/IPConfig/allow-all/methods was set to only GET,POST

On Enterprise Edition it is set to "GET,POST,HEAD,PROPFIND,PROPPATCH,MKCOL,COPY,MOVE,PUT,DELETE,LOCK,UNLOCK,OPTIONS,TRACE,CHECKOUT,CHECKIN"

Is it bad to allow all these methods on my public site? Should I change it to just GET, POST, HEAD?

Comment by Jan Haderka [ 25/Sep/11 ]

Hi Mark,
most websites do not require anything more than GET and POST to operate just fine. This is why the default setting includes just those two methods.
When you install WebDAV support, all the other methods have to be enabled as well for WebDAV to function properly. With EE you get WebDAV support installed as part of the bundle. This is why you see those additional methods enabled in the bundle, but not in the CE.
If you don't need WebDAV support on the public instance (and I would expect that you don't), then I would suggest you limit the allowed methods to just those that are necessary for your site to function properly (HEAD, GET, POST).
HTH,
Jan
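
To see what this restriction would remove from the Enterprise Edition list quoted earlier in the thread, here is a small Python sketch. The helper name methods_to_drop is hypothetical, and how the setting is actually edited in Magnolia is not covered here.

```python
# Method list as quoted for Enterprise Edition earlier in this thread.
EE_METHODS = (
    "GET,POST,HEAD,PROPFIND,PROPPATCH,MKCOL,COPY,MOVE,"
    "PUT,DELETE,LOCK,UNLOCK,OPTIONS,TRACE,CHECKOUT,CHECKIN"
)
# Methods suggested as sufficient for a public site without WebDAV.
NEEDED = {"HEAD", "GET", "POST"}


def methods_to_drop(configured, needed):
    """List configured methods that are not in the needed set, in original order."""
    return [m for m in configured.split(",") if m not in needed]


print(",".join(methods_to_drop(EE_METHODS, NEEDED)))
# PROPFIND,PROPPATCH,MKCOL,COPY,MOVE,PUT,DELETE,LOCK,UNLOCK,OPTIONS,TRACE,CHECKOUT,CHECKIN
```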

Generated at Mon Feb 12 03:45:13 CET 2024 using Jira 9.4.2#940002-sha1:46d1a51de284217efdcb32434eab47a99af2938b.