[MAGNOLIA-3333] cache: improve logging on blocking timeouts and unlock in the cache filter on exceptions Created: 22/Oct/10 Updated: 17/Aug/12 Resolved: 19/Jan/11 |
|
| Status: | Closed |
| Project: | Magnolia |
| Component/s: | cache |
| Affects Version/s: | None |
| Fix Version/s: | 4.4.2, 4.5 |
| Type: | Bug | Priority: | Major |
| Reporter: | Magnolia International | Assignee: | Philipp Bärfuss |
| Resolution: | Fixed | Votes: | 2 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
||||||||
| Issue Links: |
|
||||||||
| Template: |
|
||||||||
| Acceptance criteria: |
Empty
|
||||||||
| Task DoD: |
[ ]*
Doc/release notes changes? Comment present?
[ ]*
Downstream builds green?
[ ]*
Solution information and context easily available?
[ ]*
Tests
[ ]*
FixVersion filled and not yet released
[ ] 
Architecture Decision Record (ADR)
|
||||||||
| Bug DoR: |
[ ]*
Steps to reproduce, expected, and actual results filled
[ ]*
Affected version filled
|
||||||||
| Date of First Response: | |||||||||
| Description |
|
After a longer discussion we have done the following:
|
| Comments |
| Comment by orlando warren [ 02/Dec/10 ] |
|
We're having the same problem in production (and had to rollback because of it). Here's a stack trace: Caused by: net.sf.ehcache.constructs.blocking.LockTimeoutException: Lock timeout. Waited more than 10000ms t |
| Comment by orlando warren [ 02/Dec/10 ] |
|
btw, we're using Magnolia 4.3.8 |
| Comment by Philipp Bärfuss [ 02/Dec/10 ] |
|
What happens is the following:
The main question is why is the first request not returning. You can best find out by making a thread dump (feel free to attach it here): -> http://expertodev.wordpress.com/2009/05/30/how-to-take-java-thread-dump/ Could it be that the request is just too slow (below 10 seconds?) |
| Comment by orlando warren [ 03/Dec/10 ] |
|
We're having a hard time reproducing the issue on our QA environment. Hopefully when we do I can send it along. public void doFilter(HttpServletRequest request, HttpServletResponse response, FilterChain chain) throws IOException, ServletException { log.debug("Cache policy result: {}", cachePolicyResult); final CachePolicyResult.CachePolicyBehaviour behaviour = cachePolicyResult.getBehaviour(); executor.processCacheRequest(request, response, chain, cache, cachePolicyResult); // TODO if the cache blocks we will have to add this again. The comment seems to indicate that was what the finally was for. What do you think? Why did you comment out the code? Thanks. |
| Comment by Craig Blake [ 06/Dec/10 ] |
|
From looking at the code more, it appears that the finally clause (which use to be in the CacheFilter to handle the case where EhCache throws an exception and needs to have the lock explicitly released) was moved into the Store where the main cache get happens. However, there are other methods which call get() on the underlying blocking cache and which need to have non-timeout exceptions handled to prevent a deadlock, including the method Default.shouldCache() which is called on line 111 of CacheFilter. From what I can see when Default.shouldCache() calls get() on line 96, if a non-timeout exception is thrown, it is not handled and will result in deadlock on the cache key. Are we missing something? |
| Comment by Philipp Bärfuss [ 24/Dec/10 ] |
|
attaching an untested [patch|^magnolia-module-ache.patch] which:
Attention: the patch is based on Magnolia 4.3 |
| Comment by Jan Haderka [ 03/Jan/11 ] |
|
Can you add junit or integration test that would reexecute scenario described by Philipp and ensure that it is really solved by the patch? |
| Comment by Philipp Bärfuss [ 18/Jan/11 ] |
|
Changes are implemented but not back-ported to 4.3. This mostly improves the logs of such occasions. |
| Comment by Jan Haderka [ 19/Jan/11 ] |
|
Fix versions? Patch was also applied on 4.3 branch and not reverted from there (According to the log comments). |
| Comment by Philipp Bärfuss [ 19/Jan/11 ] |
|
I updated the title and description. The original description was:
|
| Comment by Philipp Bärfuss [ 19/Jan/11 ] |
|
The revert on 4.3 was done in revition 41282. Sorry for the lousy commit message. |