[MGNLADVCACHE-61] On "Serving from old cache": Also serve the first visitor from the old cache, and let the back end trigger the re-rendering for re-caching. Created: 16/Sep/15  Updated: 27/Nov/17  Resolved: 05/Dec/16

Status: Closed
Project: Advanced Cache
Component/s: core
Affects Version/s: 1.7.1
Fix Version/s: None

Type: Improvement Priority: Neutral
Reporter: Christian Ringele Assignee: Jan Haderka
Resolution: Won't Fix Votes: 2
Labels: support
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Relates
Template:
Acceptance criteria:
Empty
Task DoD:
[ ]* Doc/release notes changes? Comment present?
[ ]* Downstream builds green?
[ ]* Solution information and context easily available?
[ ]* Tests
[ ]* FixVersion filled and not yet released
[ ]  Architecture Decision Record (ADR)
Date of First Response:
Visible to:
Vivian Steller, zzzzJens Kolb (Lemonize) (Inactive)

 Description   

The idea comes from a support case: SUPPORT-5122

When using the strategy "Serving from old cache":
After the cache flush, the first request for a page goes against the new (empty) cache, which triggers the re-rendering and re-caching.
All other requests are served from the old cache until the new entry is there.

Now an idea, inspired by the "Eager re-caching" strategy:
Also serve the first visitor from the old cache, and trigger the re-rendering and re-caching in the back end, decoupled from the request.

That way everybody always gets fast responses, instead of the first visitor having to wait.
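To make the proposal concrete, here is a minimal, self-contained sketch of the intended behavior. It is purely illustrative and assumes nothing of Magnolia's actual cache API: the cache map, the renderer function, the refreshPending set and the executor are all hypothetical names.

    import java.util.Map;
    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.function.Function;

    // Illustration of "serve stale, re-render in the back end".
    // Not Magnolia's API; all names are hypothetical.
    public class ServeStaleCache {

        private final Map<String, String> cache = new ConcurrentHashMap<>();
        // Pages whose re-rendering is already scheduled, so a burst of requests
        // right after a flush triggers only one background render per page.
        private final Set<String> refreshPending = ConcurrentHashMap.newKeySet();
        // Dedicated pool, decoupled from the HTTP worker threads.
        private final ExecutorService recacheExecutor = Executors.newFixedThreadPool(4);
        private final Function<String, String> renderer;

        public ServeStaleCache(Function<String, String> renderer) {
            this.renderer = renderer;
        }

        // Called on a cache flush: keep the old entry, schedule re-rendering.
        public void flush(String path) {
            if (refreshPending.add(path)) {
                recacheExecutor.submit(() -> {
                    try {
                        cache.put(path, renderer.apply(path)); // re-render and re-cache
                    } finally {
                        refreshPending.remove(path);
                    }
                });
            }
        }

        // Every visitor, including the first one after a flush, gets the cached
        // copy; only a genuinely cold cache renders on the request thread.
        public String get(String path) {
            return cache.computeIfAbsent(path, renderer);
        }
    }

The refreshPending set is the crux of the idea: after a flush the stale entry keeps being served to everyone, including the first visitor, while exactly one background task re-renders the page and swaps in the fresh entry.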



 Comments   
Comment by Jan Haderka [ 25/Sep/15 ]

This is exactly why the eager re-caching strategy exists: so that nobody has to wait.

Comment by Michael Büchele [ 25/Sep/15 ]

@Jan
Please have a look at SUPPORT-5122.
Correct me if I'm wrong, but eager re-caching won't work well for all pages of a website (performance, memory, etc.)?
The documentation says that the default is the top 100 pages, so what about the rest?
We are looking for a solution that works for all pages.

Comment by Jan Haderka [ 25/Sep/15 ]

You should be able to configure the number of items for which this happens by setting the eagerRecache property on the flush policy. You should also really do this only for frequently accessed pages rather than blindly for everything, as you might end up using a big amount of extra resources (memory, CPU) on the server, and at some point the benefits of re-caching will be totally offset by the extra load.
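For orientation, a rough sketch of where such a property might sit; the configuration path and surrounding node names below are assumptions, not the documented layout, so check the Advanced Cache module documentation for your version:

    # Hypothetical flush policy configuration sketch
    modules/advanced-cache/config/<yourCacheConfiguration>/flushPolicy
        eagerRecache = 200    # re-cache only the ~200 most requested pages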

Just for clarification, the technical problems behind this particular request

  • We would need to keep both copies of the cache pretty much forever and aggregate them together when new publications happen. Early tests have shown that in the generic case the amount of resources allocated to such a task might cause stability issues (using up all available memory on the server).
  • The request spawned for re-caching while serving the old version would return the old version to the client well before re-caching is finished, and its thread would be returned to the pool and possibly reused for other requests. Thus the thread spawned from it to perform the re-caching request might be affected by execution or errors happening on the main thread (or killed if the container decides to downsize the current worker thread pool). So extra handling and an externally available thread pool would need to be created for this case to work correctly (this is different from EagerRecache, where we initiate re-caching from the flush policy and are not working with HTTP worker threads).

That doesn't mean it can't work in some specific case with just a few, really slow pages on the server; only that such a solution is not applicable to most/all scenarios used by clients.

If you really want to force it, you are free to extend the existing ServeUntilRecachedPolicy and combine it with EagerRecache, but be aware of the drawbacks mentioned above; a sketch of the thread-handling part follows below.
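To illustrate the thread-handling concern from the second bullet above: the re-rendering work has to run on a pool the application owns, not on the container's HTTP worker threads. The following is only a conceptual sketch; it does not use the real ServeUntilRecachedPolicy or EagerRecache APIs, and all names in it are hypothetical.

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.function.Consumer;

    // Hypothetical combination of "serve until recached" with an eager trigger:
    // on flush, re-rendering is handed to an application-owned pool instead of
    // being left for the next visitor's request thread.
    public class EagerServeUntilRecached {

        // Owned by the application, not the servlet container, so background
        // re-caching survives worker-thread reuse and pool downsizing.
        private final ExecutorService recachePool = Executors.newFixedThreadPool(2);
        private final Consumer<String> reRender; // renders and re-caches one path

        public EagerServeUntilRecached(Consumer<String> reRender) {
            this.reRender = reRender;
        }

        // Flush hook: the old cache entry keeps being served by the normal
        // request path while this background task produces the new one.
        public void onFlush(String path) {
            recachePool.submit(() -> reRender.accept(path));
        }
    }

The essential point is the ownership of the pool: errors, thread reuse or downsizing on the serving side cannot affect the re-caching task, which is what makes the decoupling safe.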

Comment by Michael Büchele [ 25/Sep/15 ]

So - to sum it up - there is currently no proper solution to always serve cached pages to visitors?

Comment by Christian Ringele [ 09/Feb/16 ]

(I re-opened this issue.)

I'm not happy with the answer "This is exactly why the eager re-caching strategy exists: so that nobody has to wait."
Because the question is: why is "Serving from old cache" not doing the same?
If one needs to use "Serving from old cache" and cannot use the "eager re-caching" strategy, this answer doesn't help much.

Why let the first request go through the new cache, instead of having the back end do it?
What speaks against it? In the last training I was asked exactly this:
"What if the request takes 4 seconds? Why let somebody wait 4 seconds, instead of letting the back end do it?" (And see the comment above.)

And one last input:
Why the separation of "Serving from old cache" and the "eager re-caching" strategy?
Why couldn't both be used in combination in one strategy?
I don't see why they should conceptually and technically exclude each other.
Each strategy has benefits, and combining/merging them into one would be even better.

Cheers
Christian

Comment by Jan Haderka [ 09/Feb/16 ]

Why the separation of "Serving from old cache" and the "eager re-caching" strategy?

Currently we provide the two strategies. What you are asking for is a third one. Of course it can be developed; however, even with that we would not cover all possible imaginable strategies.

Why couldn't both be used in combination in one strategy?

The current demand for the described strategy is too low to develop it in comparison to other requested new features.

I don't see why they should conceptually and technically exclude each other.

Technically, we don't support nesting of different strategies. You have to choose the one that you want to use. Of course you are free to write a custom strategy that combines both.

Each strategy has benefits, and combining/merging them into one would be even better.

If none of the out-of-the-box strategies fits your needs, you are free to look at the existing strategies and develop and deploy a custom strategy that does exactly what is desired.

Comment by Michael Büchele [ 10/Feb/16 ]

Hi guys,
I'm speaking as a customer here, and hope to increase the demand with this post.

1)
I wonder how you analyze the demand for caching optimization at all.
Pretty much all projects have to deal with caching and performance issues; you might just not get support tickets from each one?
From my point of view there is demand for sure, not only from customers (me) but also from developers (Christian's training) and hosting partners (see SysEleven "SuperCache").
Not to mention that performance is one of the four major areas you advertise with...

2)
Here is a quote from https://www.magnolia-cms.com/magnolia/performance.html:

"Magnolia's enterprise caching offers various options for caching dynamic and personalized content. Minimize server load and ensure that fresh content is served to visitors."

We have been trying to find a solution to our cache problem for about 5 months now, and are really disappointed that no proper solution seems to be available, and that not even the demand is "accepted". It feels like we're left alone here and should figure out something ourselves.

3)
The original question is still not answered:

"Why letting the first request going though the new cache, instead of doing it by the backed?"

We are not asking for anything fancy here, just that no "human visitor" should have to wait for a page to be built; that should always be done by the back end.

Thanks a lot, and hopefully this ticket will not be closed unresolved once again...

Comment by Christian Ringele [ 31/May/16 ]

I'm still not sure if we are talking about the same thing. What I mean is:
"Serving from old cache" serves from the old cache while the page is rendered again for re-caching.
But this re-rendering happens on an end-user request, not on an autonomous request issued by the system (as the "eager re-caching" strategy does for all important pages based on the journal).
My concrete question is:
Why not let the re-rendering be done by the system, instead of by the user request?

What I dislike:
Let's assume a certain type of page is cacheable, but from the nature of its back-end logic one knows that it will take 8-15 seconds to render the first time.
This means: after a cache flush, one user will always have to wait 8-15 seconds until they see the page.
Why? It just doesn't make sense.
Why not let that user also be served from the old content, and issue the re-rendering request internally from the system?
For me this is not a merge of the two strategies; it's just an improvement fixing a flaw of "Serving from old cache".
Also, in many cases not using "Serving from old cache" but switching to "eager re-caching" is not an option, as the site's hit distribution doesn't match the strategy's core mechanism (journal-based stats).

Please answer the question(s) above directly.
All your answers have been very general, but I'd be interested in the specific point I'm trying to make (it's just an improvement, not a merge or a third strategy).

---------------------------------------
---------------------------------------
Quotes from the comment above

----------> I don't see why they should conceptually and technically exclude each other.

Technically, we don't support nesting of different strategies. You have to choose the one that you want to use. Of course you are free to write a custom strategy that combines both.

As explained above, I'm not thinking about a combination, nor a third strategy.
For me it's purely an improvement of the existing "Serving from old cache".

----------> Why couldn't both be used in combination in one strategy?

The current demand for the described strategy is too low to develop it in comparison to other requested new features.

If you think I'm asking for a third strategy, then yes, but I'm not.
And I doubt that people communicate this flaw of "Serving from old cache" as demand.
If you asked all our customers the following:
You are using the "Serving from old cache" strategy. The cache is flushed and a page is requested for the first time.
Choose which you would prefer:
1. On the first request, the end user is blocked for 8-14 seconds until the page is rendered, while all others are served fast from the old cache.
2. On the first request, the end user is served fast from the old cache like all other users, and 8-14 seconds later all users are served the new page from the new cache.

You really want to tell me that customers would not be demanding option 2 if they knew?
I suspect that for such long-rendering pages they just look for different, custom solutions and don't communicate this, as they can't break the problem down to an implementation flaw and an easy change to fix it.

Comment by Michael Büchele [ 31/May/16 ]

Comment by Jan Haderka [ 05/Dec/16 ]

Yes, all the reasons why yet another custom policy is required for such corner cases are valid. And there are even more corner cases that you have not listed that would require yet more custom policies. For those reasons, policies are fully customizable and every user can implement the one that solves their corner case, while a limited set of base policies covers only the most common use cases.
The case described here has all the hallmarks of a corner case. It has not managed to attract attention and demand from other users over the time it has been described here. It has also not been reported or requested by the wider community, hence it will not be implemented any time soon. To move forward, the best course of action is to implement it as a custom caching policy.
