[MGNLCACHE-254] Config to cache responses larger than 500KB Created: 01/Apr/22 Updated: 18/Apr/23 Resolved: 31/Mar/23 |
|
| Status: | Closed |
| Project: | Cache Modules |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 5.9.5 |
| Type: | Improvement | Priority: | Major |
| Reporter: | Tomáš Gregovský | Assignee: | Chuong Doan Huy |
| Resolution: | Done | Votes: | 1 |
| Labels: | None |
| Σ Remaining Estimate: | Not Specified | Remaining Estimate: | Not Specified |
| Σ Time Spent: | 3d 7h | Time Spent: | 3d 7h |
| Σ Original Estimate: | Not Specified | Original Estimate: | Not Specified |
| Attachments: | |
| Issue Links: | |
| Sub-Tasks: | |
| Template: | |
| Acceptance criteria: | Empty |
| Task DoD: |
[X]* Doc/release notes changes? Comment present?
[X]* Downstream builds green?
[X]* Solution information and context easily available?
[X]* Tests
[X]* FixVersion filled and not yet released
[ ] Architecture Decision Record (ADR)
|
| Release notes required: | Yes |
| Documentation update required: | Yes |
| Date of First Response: | |
| Epic Link: | Support |
| Sprint: | DevX 34 |
| Story Points: | 3 |
| Team: | |
| Work Started: | |
| Approved: | Yes |
| Description |
|
We are using a delivery endpoint. Here is the definition:

class: info.magnolia.rest.delivery.jcr.v2.JcrDeliveryEndpointDefinition
workspace: blog_en_blogs2
limit: 100000
referenceDepth: 100
depth: 0
nodeTypes:
  - mgnl:composition
references:
  - name: imageReference
    propertyName: bannerImage
    referenceResolver:
      class: info.magnolia.rest.reference.dam.AssetReferenceResolverDefinition
      assetRenditions:
        - 450
        - 900
  - name: authorReference
    propertyName: authors
    referenceResolver:
      targetWorkspace: authors
      class: info.magnolia.rest.reference.jcr.JcrReferenceResolverDefinition
  - name: categoriesReference
    propertyName: categoriesFilter
    referenceResolver:
      class: info.magnolia.rest.reference.jcr.JcrReferenceResolverDefinition
      targetWorkspace: category

This endpoint is supposed to be cached but is not. When checking the Cache Tools app, there is an entry for this endpoint, but it cannot be downloaded. My assumption is that this entry is empty. On the front end, the endpoint is requested without any parameters. The loading time of this file is usually between 1.5 and 4 seconds (depending on the connection), which is a lot for a 134 kB file. By the way, the browser reports 134 kB for this endpoint, but that seems to be the size after compression; it is 1.1 MB before compression.
|
| Comments |
| Comment by Jaroslav Simak [ 05/Apr/22 ] |
|
Cache threshold is hardcoded here: info.magnolia.module.cache.filter.CacheResponseWrapper#DEFAULT_THRESHOLD.
|
| Comment by Christopher Zimmermann [ 05/Apr/22 ] |
|
Seems like we need to lower the threshold, or make it more configurable. Or, best of all, could it be based on the time of computation rather than just the size of the response? For example, if it takes over 1 second to compute, it should be cached? |
| Comment by Christopher Zimmermann [ 12/May/22 ] |
|
But is that CacheResponseWrapper about Magnolia caching the response, or about putting cache headers on the response so that the browser caches it? |
| Comment by Christopher Zimmermann [ 12/May/22 ] |
|
tgregovsky - your concern is about Magnolia doing the caching, correct? (Not about browser caching?) |
| Comment by Tomáš Gregovský [ 12/May/22 ] |
|
hi czimmermann, yes: Magnolia server-side caching (the same as for pages). Some delivery endpoints are too big and take a couple of seconds to load, and they are loaded on every visit, which is a performance issue for Magnolia. |
| Comment by Christopher Zimmermann [ 16/May/22 ] |
|
jsimak from this page: https://docs.magnolia-cms.com/product-docs/6.2/Modules/List-of-modules/Cache-modules/Cache-core.html#_in_memory_threshold Do you understand this paragraph?
Is this implying that if a response is under 500K, it's more efficient to get it from the repository than from memory? This seems unlikely to me. I'm wondering what the threshold is for. I would expect all responses to be faster from the in-memory cache, and not just faster, but also lighter on the templating/rendering system. |
| Comment by Christopher Zimmermann [ 18/May/22 ] |
|
tgregovsky we are going to look into this in the sprint starting next week (May 23) to see if there is a bug, why there is this 500 KB threshold, and how to improve caching behaviour. In the meantime, here are some things you could do, depending on how urgent this is. Option 1: Change your REST requests so that the response size is less than 500 KB. If it's less than 500 KB, it should be cached. Option 2: Create a custom REST endpoint (requires Java) and change the threshold value there, as mentioned here: https://docs.magnolia-cms.com/product-docs/6.2/Modules/List-of-modules/Cache-modules/Cache-core.html#_in_memory_threshold "You can still change this value programmatically, for example, in your custom renderer which does time-consuming operations:" |
| Comment by Jaroslav Simak [ 06/Jun/22 ] |
|
We will not increase the threshold. If there is a need for JSON responses larger than 500 KB, developers should use pagination. |
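The pagination suggested above could look roughly like the following sketch. It assumes the delivery endpoint accepts `limit` and `offset` query parameters; the endpoint path `/.rest/delivery/blogs`, the class name, and the item counts are hypothetical and only serve as illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: splits one large delivery request into several smaller, paged requests. */
final class PagedDeliveryRequests {

    /**
     * Builds one URL per page. Assumes the endpoint understands the
     * "limit" and "offset" query parameters (an assumption, not verified here).
     */
    static List<String> pageUrls(String endpointUrl, int totalItems, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<String> urls = new ArrayList<>();
        for (int offset = 0; offset < totalItems; offset += pageSize) {
            urls.add(endpointUrl + "?limit=" + pageSize + "&offset=" + offset);
        }
        return urls;
    }

    public static void main(String[] args) {
        // Hypothetical endpoint path and item count.
        for (String url : pageUrls("/.rest/delivery/blogs", 250, 100)) {
            System.out.println(url);
        }
    }
}
```

Each page then stays below the size threshold, at the cost of several round trips and client-side merging of the results.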
| Comment by Tomáš Gregovský [ 07/Jun/22 ] |
|
hi jsimak, that's a pity, to be honest. Just to sum up again: there is an endpoint of 134 kB (not too big in my opinion), but it sometimes takes up to 4 seconds to receive, probably because the data is computed (JCR query, etc.) before it is returned, every single time for every response. In our use case we need to receive all the data at once (pagination is not an option). Also, since you can't specify which subnodes should or should not be part of the JSON data, the fact that endpoints can't be cached makes using delivery endpoints in production quite hard. |
| Comment by Michael Schneider [ 05/Jul/22 ] |
|
Hi jsimak, can you have a look at Tomas' last comment? We could use some support on this issue as described above. Thanks, Michi |
| Comment by Pierre Sandrin [ 04/Aug/22 ] |
|
Hello, we are running into the same issue on one of our headless projects. With all the resolved references, the response for the homepage becomes 1.6 MB with a response time of about 1.2 s. The CPU load is quite high when these requests are not cached. Isn't caching more important the larger (and costlier) the response is? The 500 KB threshold doesn't make any sense to me. We would appreciate it if you could try to find a solution for this, since we are not the only ones having this problem. Otherwise we have to throw the delivery endpoints in the trash, which would be a pity because they are so cool! |
| Comment by Chuong Doan Huy [ 16/Mar/23 ] |
|
czimmermann Yes, the Magnolia cache currently uses Ehcache3, which already implements removal when content exceeds the limit. Discovery: Something we can do: Additional concern (needs verification): threshold calculation mismatch (Postman indicates a response size much smaller than the threshold, but the calculation in the code still marks it as over the threshold). |
| Comment by Christopher Zimmermann [ 16/Mar/23 ] |
|
Thanks chuong.doan. I think making the threshold configurable seems like a good path. Configuration must be easy to do, ideally via a light development YAML file. What do you think, pierre and tgregovsky? One question is where to configure it: globally somehow, or on the delivery endpoints? Regarding "threshold calculation mismatch": maybe this is due to compression over HTTP? https://en.wikipedia.org/wiki/HTTP_compression You mention that the cache limit is determined by the number of entries (10000). I would have thought the cache limit should rather be set by the amount of memory to allocate to the cache. Is that also an option? Would that not be safer, as far as preventing the cache from overwhelming the server? In general, my hunch is that this very large response size is rather an edge case. I would not think adding a disk cache is necessary at this time, unless someone has a need for it. |
| Comment by Pierre Sandrin [ 17/Mar/23 ] |
|
1) Making it configurable would be a solution for us. Per endpoint or in the cache config is OK. 2) I agree with Christopher that it is probably the gzip compression that reduces the size. The uncompressed size is the value that counts for the threshold. 3) I'm not too concerned about "exploding" the cache, since there are not many different REST requests to be cached that are so big. It's mainly the one to the /home page that is very big. 4) I agree that, if possible, the cache limit should be set as an amount of memory. |
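The compression effect discussed above can be reproduced with a small sketch. It builds a repetitive JSON payload, gzips it, and compares both sizes against a 500 KB limit. The class name, the sample payload, and the reading of 500 KB as 500 * 1024 bytes are assumptions for illustration; this is not the Magnolia threshold calculation itself.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

/** Sketch: raw vs. gzip-compressed size of a JSON payload, compared with a size limit. */
final class CompressedSizeSketch {

    // Assumption: 500 KB taken as 500 * 1024 bytes.
    static final int THRESHOLD_BYTES = 500 * 1024;

    /** Builds a repetitive JSON array, loosely shaped like a list of blog entries. */
    static byte[] samplePayload(int items) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < items; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append("{\"@name\":\"post-").append(i)
              .append("\",\"title\":\"Example blog post\",\"authors\":[\"author-a\",\"author-b\"]}");
        }
        return sb.append(']').toString().getBytes(StandardCharsets.UTF_8);
    }

    /** Compresses the bytes the way an HTTP layer with gzip encoding would. */
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    static boolean exceedsThreshold(int sizeInBytes) {
        return sizeInBytes > THRESHOLD_BYTES;
    }

    public static void main(String[] args) {
        byte[] raw = samplePayload(12_000);
        byte[] compressed = gzip(raw);
        System.out.printf("raw:  %d bytes, over threshold: %b%n", raw.length, exceedsThreshold(raw.length));
        System.out.printf("gzip: %d bytes, over threshold: %b%n", compressed.length, exceedsThreshold(compressed.length));
    }
}
```

A client such as Postman or a browser typically reports the transferred (compressed) size, which would explain why a response can look well under the limit while its uncompressed size is over it.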
| Comment by Chuong Doan Huy [ 17/Mar/23 ] |
|
1) Configured in the cache config, so that it's consistent with other cache configurations. Config per endpoint would require a more complex refactoring, so I would opt for a global config in the cache. For viet.nguyen's comment:
| Comment by Christopher Zimmermann [ 20/Mar/23 ] |
|
I think I would leave the current default as is (using ENTRIES). The reason is that I would hate to impact an existing customer's instance negatively. If a developer configures a new limit for the size of items to be cached, they can also configure those other values. Do you see any problems if we keep the default as is? If we change the default, I would change it in 6.3. |
| Comment by Chuong Doan Huy [ 20/Mar/23 ] |
|
Thanks Topher. No problem keeping the default, even better IMHO. |
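The outcome of the discussion (a globally configurable threshold with the existing default left untouched) can be sketched as follows. All names here are hypothetical and do not reflect the actual Magnolia classes or configuration properties. The default is assumed to be 500 * 1024 bytes, and a response exactly at the limit is assumed to still be cacheable.

```java
/** Sketch: size limit for cacheable responses, with an unchanged default and an optional override. */
final class CacheThresholdSketch {

    // Assumption: the existing 500 KB default, taken as 500 * 1024 bytes.
    static final long DEFAULT_THRESHOLD_BYTES = 500L * 1024;

    private final long thresholdBytes;

    private CacheThresholdSketch(long thresholdBytes) {
        if (thresholdBytes <= 0) {
            throw new IllegalArgumentException("threshold must be positive");
        }
        this.thresholdBytes = thresholdBytes;
    }

    /** Existing installations keep the current behaviour. */
    static CacheThresholdSketch withDefault() {
        return new CacheThresholdSketch(DEFAULT_THRESHOLD_BYTES);
    }

    /** Developers who need larger cached responses opt in explicitly. */
    static CacheThresholdSketch withThreshold(long thresholdBytes) {
        return new CacheThresholdSketch(thresholdBytes);
    }

    /** The uncompressed size is what counts, as noted in the comments above. */
    boolean isCacheable(long uncompressedResponseBytes) {
        return uncompressedResponseBytes <= thresholdBytes;
    }

    public static void main(String[] args) {
        long responseBytes = 1_100_000L; // roughly the 1.1 MB response from the description
        System.out.println("default:    " + withDefault().isCacheable(responseBytes));
        System.out.println("configured: " + withThreshold(2L * 1024 * 1024).isCacheable(responseBytes));
    }
}
```

Keeping the default in a single constant and requiring an explicit override mirrors the decision not to change behaviour for existing instances.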