[MGNLADVCACHE-17] ServeUntilRecachedPolicy and NotifyFlushListeningPolicy don't work with policies sets Created: 19/Apr/11  Updated: 06/Jun/11  Resolved: 06/Jun/11

Status: Closed
Project: Advanced Cache
Component/s: None
Affects Version/s: 1.3
Fix Version/s: 1.3.2

Type: Bug Priority: Major
Reporter: Christian Kutschke Assignee: Jan Haderka
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: JPEG File cache configuration.jpg     Text File cache-debug.log     XML File config.modules.advanced-cache.xml     XML File config.modules.advanced-cache_author.xml     XML File config.modules.advanced-cache_public.xml     XML File config.modules.cache.config.configurations.default.xml     XML File config.modules.cache.config.xml     XML File config.server.filters.xml     Text File magnolia-cache.log     Text File magnolia-cache.log    
Issue Links:
causality
caused by MAGNOLIA-2677 Caching big content may cause OutOfMe... Closed
dependency
depends upon MAGNOLIA-3686 Expose original URL and modification ... Closed
supersession
Template:
Acceptance criteria:
Empty
Task DoD:
[ ]* Doc/release notes changes? Comment present?
[ ]* Downstream builds green?
[ ]* Solution information and context easily available?
[ ]* Tests
[ ]* FixVersion filled and not yet released
[ ]  Architecture Decision Record (ADR)
Bug DoR:
[ ]* Steps to reproduce, expected, and actual results filled
[ ]* Affected version filled
Date of First Response:

 Description   

We have got a problem with the advanced cache in our Magnolia 4.4.2 EE setup.
We have setup the Advanced Cache module according to http://documentation.magnolia-cms.com/modules/advanced-cache.html with /modules/cache/config/configurations/default/cachePolicy/class set to info.magnolia.module.advancedcache.ServeUntilRecachedCachePolicy and /modules/cache/config/configurations/default/flushPolicy/class set to info.magnolia.module.advancedcache.NotifyFlushListeningPolicy.

In our developing/testing environment this works fine. The problems occur on our productive environment:

  • we have 1 author instance with 2 subscriber instances (author.domain.de publishes to public01.domain.de and public02.domain.de)
  • the subscribers are behind a netscaler that divides the load between the 2 public nodes
  • browsers access www.domain.de and get either node 1 or 2

The problem seems to be, that the cache is never invalidated:

  • page in version 1 is published
  • browser gets page via netscaler in version 1
  • page in version 2 is published
  • accessing the 2 publish nodes directly delivers the correct response
  • accessing them via netscaler shows the old (cached) version indefinitely --> it is never invalidated

Two observations:

  • only flushing the magnolia cache manually solves the problem (until the next update) so it must definitely be the magnolia cache
  • removing one server node from the netscaler (and thus effectively reducing it to a forward to the other node) yields the same problem

One more thing: as I understand it, the keys for the cache are the URLs of the pages. Could it be, that our setup with 2 public nodes with distinct URLs (which work fine cache wise) are accessed through another URL (via netscaler) generates problems with this caching policy? Such as, the URL accessed (www.domain.de/test.html) is never invalidated, the 2 nodes only invalidate their own respective caches with keys/URLs like public01.domain.de/test.html and public02.domain.de/test.html).

Page headers are set to "don't cache at all" (i.e. "Expires = 01.01.1970", "Cache-Control = no-cache, no-store, must-revalidate, max-age=0", "Pragma = no-cache") and still the old content is shown in the browser. So it can't be the netscaler caching content based on headers. Also browser cache can't be an issue, because it is disabled.

Is this a common issue? How do we work around it? We would really like to use the ServeUntilRecachedCachePolicy as it fits our requirements the best.

Read the enclosed logfile as follows:

  • entries at 15:29:xx contain a activation of test content and accessing the test page on the public nodes (with the old version still shown)
  • entries at 15:32:xx contain a flush of magnolia cache and accessing the test page on the public nodes (with the new version correctly delivered)


 Comments   
Comment by Christian Kutschke [ 27/Apr/11 ]

Do you need mor information? Has anyone looked in this issue yet?

Comment by Jan Haderka [ 11/May/11 ]

could you please provide me with your exact cache and filter chain configuration? Go to Configuration and export config:/server/filters and config:/modules/caches/config from the public instance and attach it to this issue or to the related support ticket. Since you get the correct response when accessing Magnolia nodes directly I'm bit at loss what the cause could be.
Can you also share more details about what kind of load balancer (netscaler) you use and how it is configured? Does it use sticky sessions? Is it configured to just distribute the load or does it do some additional caching?
What happens when you disable Magnolia cache completely or use different cache strategy?

Comment by Jan Haderka [ 11/May/11 ]

Renaming of the CachedPage to CachedEntry invalidates checks in advance cache strategies, thus making such strategies unusable since Magnolia 4.4.

Comment by Christian Kutschke [ 11/May/11 ]

Some additional information on the netscaler:

  • Works as reverse proxy
  • Sticky sessions are enabled
  • no caching is done

Manual test on the tomcat server (one of the productive nodes):

  • telnet to local http connector
  • GET /test.html HTTP/1.1
  • Host: lhm.muenchen.swm.de

Varying the host header (e.g. to the specific node host names) yielded different results. If the host header matches the subscriber configuration latest content is returned, using the cluster name old content from cache is served.

Comment by Jan Haderka [ 12/May/11 ]

Varying the host header (e.g. to the specific node host names) yielded different results. If the host header matches the subscriber configuration latest content is returned, using the cluster name old content from cache is served.

yeah, that makes sense. The cache key is composed of the server name among other things so using the original name retrieves old unflushed cached content. While the local name of the node results in different key thus generating new entry and masking the original problem.

Thanks for all the info.

Comment by Christian Kutschke [ 27/May/11 ]

We updated our setup to Magnolia EE 4.4.4 and Advanced Cache 1.3.1 as suggested. Our setup ist still the same:

  • 1 author publishing to two separate subscribers
  • access the public cluster through a different URL --> old content
  • access the server node directly --> new content

So basically the same result - from my point of view nothing has changed. Could you please look into it again or give me details on what you changed in the advanced cache module?

Best regards,
Christian

Comment by Jan Haderka [ 28/May/11 ]

could you please clean up the cache manually, then enable debug level logging for cache and attach the log file from the try to this issue?
Could you also please export module configuration from config:/modules/advanced-cache/config/ and attach it to the issue? Thanks.

The detailed changes are visible in the "Subversion Commits" tab on this issue. The problem we identified originally was that when policies are chained and wrapped in the delegate classes, module was not able to unwrap them anymore. This is now replaced by obtaining policies configuration data such as last update timestamp directly via module class.

Regards,
Jan

Comment by Christian Kutschke [ 30/May/11 ]

Hi Jan,

see enclosed the requested files. The test setup is the following:

Testing always consists of the following steps:

1. Activate new content to clean cache

2. Change and activate

3. Clear cache again

Bets regards,
Christian

Comment by Christian Kutschke [ 30/May/11 ]

Magnolia DEBUG cache log for info.magnolia.module.cache and info.magnolia.module.advancedcache

Comment by Christian Kutschke [ 30/May/11 ]

Module config for advanced cache

Comment by Christian Kutschke [ 30/May/11 ]

Another thing I noticed:
when first installing the new versions for advanced cache module, there was a JCRException in the log that the timestamps for the policies could not be found. It looked to me like the module was looking for nodes in uppercase (ServeUntilRecachedCachePolicy) but the nodes in the repository were lowercase (serveUntilRecachedCachePolicy).

Also the timestamps look like they are never updated (always 0).

Comment by Jan Haderka [ 31/May/11 ]

Hi Christian,
the change from upper case to lower case was part of changes made for this issue to conform to JavaBean convention and allow C2B to load the config nodes automatically. Unfortunately it seems that the relevant code was not committed prior the release.
I've made the commit and released the snapshot version of the module at http://nexus.magnolia-cms.com/content/repositories/magnolia.enterprise.snapshots/info/magnolia/magnolia-module-advanced-cache/1.3.2-SNAPSHOT/magnolia-module-advanced-cache-1.3.2-20110531.070807-2.jar

Could you please download this jar and replace your current magnolia-advanced-cache-1.3.1.jar with it? Once you confirm the functionality we will release this as version 1.3.2.

Thanks you.
Jan

Comment by Christian Kutschke [ 31/May/11 ]

Hi Jan,

thanks for the update, unfortunately I see no changes in the behavior. Also the values of lastUpdateTimestamp in the module config still never get changed (remains '0' forever). I enclosed the two files (from author and public) for you.

BTW: the caching issue never occurs on two author instances (probably because authors don't cache at all?). But if I change the publisher instance to admin=false, it's the same issue as before.

Best regards,
Christian

Comment by Christian Kutschke [ 31/May/11 ]

Author's module config of advanced cache.

Comment by Christian Kutschke [ 31/May/11 ]

Publishers config of module advanced cache.

Comment by Jan Haderka [ 31/May/11 ]

Content is not cached on the author (and even on public) when you are actually logged in. To understand this better you might want to have a look at config:/modules/cache/config/configurations/default/cachePolicy ... there is whole voter chain setup that allows you to control what will get cached and what not on the very low level.

Back to the issue at hand, could you please export the cache configuration from config:/modules/cache/config/configurations/default so I can double check it is setup properly? Is the error in the logs still the same as before? Could you attach the log (from public instance) from attempt to activate something with the advanced-cache snapshot at public instance?

Thanks.

Comment by Christian Kutschke [ 31/May/11 ]

Cache-Config

Comment by Christian Kutschke [ 31/May/11 ]

Updated Cache Log (new entries starting with 31.05.2011 16:07:13).

Contains 2 activations, both not reflected on 1 node, then clearing of cache and correct result on both nodes.

Comment by Christian Kutschke [ 31/May/11 ]

Hi Jan,

see enclosed the requested files.

Best regards,
Christian

Comment by Jan Haderka [ 03/Jun/11 ]

Hi Christian,
can you please look at the attached image of the flush policy configuration?
It seems to me that in your configuration, you are actually overriding the delegate policy class instead of the FlushAll policy file. Since there are no repositories configured on which to react for the delegate policy, changing it's class to the Notify policy will not work.
HTH,
Jan

Comment by Christian Kutschke [ 06/Jun/11 ]

Hi Jan,

thank you for the pointer - I reinstalled the cache and advanced Cache modules from scratch, reconfigured them following the howto and it seems to work now on my test system - at least it looks promising! I'll have the final result after I can redeploy to our productive environment (as only here we have the real setup with cluster and netscaler). I'll report the results asap.

Thanks for your time and patience - pity if the answer is this bootstrap hiccup.

Best regards,
Christian

Comment by Christian Kutschke [ 06/Jun/11 ]

Could you provide the temporary fix of the advanced cache module as a permanent 1.3.2 in the repository? Thank you!

Comment by Jan Haderka [ 06/Jun/11 ]

Sure. The fix will be released as 1.3.2.

Generated at Sun Feb 11 23:10:11 CET 2024 using Jira 9.4.2#940002-sha1:46d1a51de284217efdcb32434eab47a99af2938b.